Common IRC bot URL title vulnerabilities

Carriage return in title:
Solution: strip carriage return and newline characters from the title before sending it, since IRC uses them to terminate a message
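A minimal sketch of that stripping step in Python. The function name `sanitize_title` is hypothetical, and also removing NUL bytes and collapsing whitespace goes slightly beyond what the text asks for; treat both as assumptions.

```python
import re

# CR and LF terminate an IRC message; NUL is included here as an extra precaution.
_LINE_BREAKERS = re.compile(r"[\r\n\x00]")


def sanitize_title(title: str) -> str:
    """Replace CR, LF and NUL with spaces, then collapse runs of whitespace."""
    cleaned = _LINE_BREAKERS.sub(" ", title)
    return " ".join(cleaned.split())
```

With this in place, a title such as "Hello\r\nQUIT :bye" is sent as a single line of text instead of being split into a second IRC command.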

Valid but uncommon tag formats:
Solution: use a proper HTML parser or a more robust regex (e.g. something like <title[^>]*>([^<]*)</title\s*> in case-insensitive mode), then decode the HTML entities in the captured title; also see hard mode, which probably requires an HTML parser
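One possible parser-based approach, sketched with Python's standard `html.parser` module. The names `TitleParser` and `extract_title` are made up for this example, and returning `None` when no complete title is found is an assumption about how the caller wants to handle that case.

```python
from html.parser import HTMLParser
from typing import List, Optional


class TitleParser(HTMLParser):
    """Collects the text of the first <title> element, whatever its case or attributes."""

    def __init__(self) -> None:
        # convert_charrefs=True makes the parser decode entities such as &amp;
        super().__init__(convert_charrefs=True)
        self._in_title = False
        self._done = False
        self._parts: List[str] = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so <TITLE> and <Title> arrive as "title".
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True

    def handle_data(self, data):
        if self._in_title:
            self._parts.append(data)

    @property
    def title(self) -> Optional[str]:
        if not self._done:
            return None
        text = " ".join("".join(self._parts).split())
        return text or None


def extract_title(html: str) -> Optional[str]:
    """Return the decoded, whitespace-normalised title, or None if there is none."""
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    return parser.title
```

Because the function returns `None` rather than raising, the caller can simply stay silent when a page has no title.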

No <title> tag:
Solution: handle the case where no <title> tag can be found

Long title messages:
Solution: always truncate the result before sending it in a message (the maximum IRC message size, including source, command, args and the trailing CRLF, is 512 bytes, so the maximum title length is somewhere around 450 bytes)
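Since the limit is in bytes rather than characters, truncation has to happen on the encoded form. A rough sketch follows; the function name `truncate_utf8` and the default budget of 400 bytes are assumptions (the real budget depends on the length of your nick, host and channel name).

```python
def truncate_utf8(text: str, max_bytes: int = 400, ellipsis: str = "...") -> str:
    """Shorten text so its UTF-8 encoding fits in max_bytes, ellipsis included."""
    encoded = text.encode("utf-8")
    if len(encoded) <= max_bytes:
        return text
    budget = max_bytes - len(ellipsis.encode("utf-8"))
    # errors="ignore" drops a multi-byte character that was cut in half at the boundary.
    return encoded[:budget].decode("utf-8", errors="ignore") + ellipsis
```

Cutting the encoded bytes and decoding with `errors="ignore"` avoids sending a broken multi-byte sequence at the end of the message.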

Large file size:
Solution: if handling non-HTML pages, store the file size in an integer of at least 64 bits, take the filename from the Content-Disposition header, and don't read the page content
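A small sketch of describing a non-HTML file from its headers alone. Both helper names are hypothetical. Python integers are arbitrary-precision, so the 64-bit concern does not arise here; it matters in languages with fixed-width integer types.

```python
from email.message import Message
from typing import Optional


def filename_from_content_disposition(value: str) -> Optional[str]:
    """Extract the filename parameter from a Content-Disposition header value."""
    msg = Message()
    msg["Content-Disposition"] = value
    return msg.get_filename()


def human_size(num_bytes: int) -> str:
    """Format a byte count using binary units; works for sizes far beyond 4 GiB."""
    units = ("B", "KiB", "MiB", "GiB", "TiB")
    size = float(num_bytes)
    index = 0
    while size >= 1024 and index < len(units) - 1:
        size /= 1024
        index += 1
    return f"{size:.1f} {units[index]}"
```

Both values come from the response headers, so the body never needs to be downloaded.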

IP address:
Solution: there is no viable way to detect every representation of the IP address (e.g. if the string from the above link is hidden, use this link instead and base64-decode the result)

CTCP messages:
Solution: strip ASCII SOH (byte 0x01) from the start and end of the message, or prefix the title with some string
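Either measure is enough on its own; the sketch below applies both. The function name `make_safe_reply` and the "Title: " prefix are illustrative assumptions.

```python
def make_safe_reply(title: str, prefix: str = "Title: ") -> str:
    """Strip leading/trailing SOH bytes and prepend a fixed prefix.

    A CTCP message must start with 0x01, so either measure alone stops the
    reply from being interpreted as CTCP; doing both is belt and braces.
    """
    return prefix + title.strip("\x01")
```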

Infinite redirect:
Solution: limit the number of redirects followed
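A sketch of a redirect cap, written against a hypothetical `fetch` callback that returns the status code and the Location header, so it is not tied to any particular HTTP library. The limit of 5 is an arbitrary assumption; many HTTP libraries offer an equivalent setting that may be preferable.

```python
from typing import Callable, Optional, Tuple
from urllib.parse import urljoin

# Hypothetical callback: fetch(url) -> (status_code, location_header_or_None)
Fetch = Callable[[str], Tuple[int, Optional[str]]]

REDIRECT_CODES = (301, 302, 303, 307, 308)


class TooManyRedirects(Exception):
    pass


def resolve_redirects(url: str, fetch: Fetch, max_redirects: int = 5) -> str:
    """Follow redirects until a non-redirect response, giving up after max_redirects."""
    for _ in range(max_redirects + 1):
        status, location = fetch(url)
        if status not in REDIRECT_CODES or not location:
            return url
        # Location may be relative, so resolve it against the current URL.
        url = urljoin(url, location)
    raise TooManyRedirects(f"gave up after {max_redirects} redirects")
```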

1 GiB HTML page (but HEAD returns Content-Length: 42):
Solution: only read part of the page (e.g. 16 KiB), regardless of what Content-Length claims; in most sane pages, the title is near the beginning; also, set a timeout in case the page loads too slowly

Page with title at the beginning, followed by a gigabyte of data:
Solution: only read the start of the page (e.g. 16 KiB) and try to find the <title> tag in that, even if it isn't the whole page
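Both oversized-page cases come down to reading a bounded prefix and never trusting Content-Length. A minimal sketch over any binary file-like object (for example a streamed response body) is shown here. The function name, the 10-second default and the chunk size are assumptions. The deadline is only checked between reads, so a socket-level timeout is still needed for a read that blocks.

```python
import io
import time
from typing import BinaryIO


def read_prefix(stream: BinaryIO, max_bytes: int = 16 * 1024,
                timeout: float = 10.0, chunk_size: int = 4096) -> bytes:
    """Read at most max_bytes from stream, stopping early at EOF or the deadline."""
    deadline = time.monotonic() + timeout
    chunks = []
    remaining = max_bytes
    while remaining > 0 and time.monotonic() < deadline:
        chunk = stream.read(min(chunk_size, remaining))
        if not chunk:  # EOF
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


if __name__ == "__main__":
    # A stand-in for a huge page: only the first 16 KiB is ever read.
    page = io.BytesIO(b"<title>hi</title>" + b"x" * 1_000_000)
    print(len(read_prefix(page)))
```

Whatever comes back is then handed to the title extraction step, even if it is only part of the page.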

1 GiB of small headers:
Solution: include headers in your timeout and/or read size limit

Extremely long header:
Solution: set your limits on the actual amount of data read, not just on the number of headers
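Both header cases can be covered by counting bytes while the headers are read. The sketch below assumes the status line has already been consumed and that headers are parsed by hand from a binary stream; the names and the 64 KiB / 8 KiB limits are assumptions. If your HTTP library parses headers for you, check which limits it enforces instead.

```python
import io
from typing import BinaryIO, Dict


class HeadersTooLarge(Exception):
    pass


def read_headers(stream: BinaryIO, max_total: int = 64 * 1024,
                 max_line: int = 8 * 1024) -> Dict[str, str]:
    """Read header lines up to the blank line, capping per-line and total bytes."""
    headers: Dict[str, str] = {}
    total = 0
    while True:
        # readline(limit) returns at most limit bytes, so a header that never
        # ends cannot make this call read without bound.
        line = stream.readline(max_line + 1)
        total += len(line)
        if len(line) > max_line or total > max_total:
            raise HeadersTooLarge("header section exceeds the configured limits")
        if line in (b"\r\n", b"\n", b""):
            return headers
        name, _, value = line.decode("latin-1").partition(":")
        headers[name.strip().lower()] = value.strip()


if __name__ == "__main__":
    raw = io.BytesIO(b"Content-Type: text/html\r\n\r\n<title>hi</title>")
    print(read_headers(raw))
```

The total cap catches a flood of small headers, and the per-line cap catches a single enormous one.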

Compose a <title> message: