Let's Talk about HTTP Cookies

Why do we need cookies?

Cookies were invented at Netscape in the mid-1990s as a way to store some user information in the browser rather than requiring the web server to store it.

HTTP is a stateless protocol, which means it remembers nothing between request/response exchanges. The server handles each request independently of the ones that came before it, so on its own it doesn't remember whether you have already logged in, set a preferred language, and so on.

The use cases for cookies have evolved over the past two decades and they are now commonly used for:

  • Session Management
  • Personalization
  • Tracking Activity for Serving Ads

What's in a cookie?

A cookie is simply text data that's stored by the browser and contains the following information:

  • A name-value pair containing the actual data (e.g. id = 1034820810)
  • An expiration date that sets the lifetime of the cookie (UTC/GMT format). If no expiration date is set, then the cookie is discarded once the user closes their browser.
  • The domain and path the cookie belongs to. The browser uses these to decide which hosts and URL paths the cookie is sent to.
  • (Optional) Secure instructs the browser to send the cookie only over HTTPS
  • (Optional) HttpOnly prevents the cookie from being read by client-side code such as JavaScript. This is used to keep malicious scripts from stealing cookies with sensitive data like session IDs.
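
To make the pieces above concrete, here is a minimal sketch of how a server-side helper might assemble them into the value of a Set-Cookie header. The function name serializeCookie and its option names are made up for this illustration, and URL-encoding the value is a common convention rather than something the cookie format requires.

```javascript
// Build the value of a Set-Cookie header from a name-value pair and
// optional attributes. The function and option names are illustrative.
function serializeCookie(name, value, options = {}) {
  // The name-value pair always comes first.
  const parts = [`${name}=${encodeURIComponent(value)}`];

  if (options.expires instanceof Date) {
    // toUTCString() produces a date such as "Fri, 13 Oct 2017 05:24:38 GMT".
    parts.push(`Expires=${options.expires.toUTCString()}`);
  }
  if (options.domain) parts.push(`Domain=${options.domain}`);
  if (options.path) parts.push(`Path=${options.path}`);
  if (options.secure) parts.push("Secure");
  if (options.httpOnly) parts.push("HttpOnly");

  return parts.join("; ");
}

const header = serializeCookie("id", "1034820810", {
  expires: new Date(Date.UTC(2017, 9, 13, 5, 24, 38)), // month 9 is October
  domain: ".nytimes.com",
  path: "/",
  secure: true,
  httpOnly: true,
});
console.log(`Set-Cookie: ${header}`);
```

Leaving out the expires option produces a cookie with no expiration date, which the browser discards when it closes.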

Here's a screenshot from Chrome Developer Tools for the cookies set by nytimes.com:

How is a cookie created?

To create a cookie, the server adds a Set-Cookie header to its HTTP response:

HTTP/1.1 200 OK 
Content-Type: text/html; charset=utf-8
Date: Thu, 13 Oct 2016 16:03:41 GMT
Set-Cookie: RMID=007f010109ee57ff1a960007; path=/; domain=.nytimes.com; expires=Fri, 13 Oct 2017 05:24:38 UTC

Once the browser has stored the cookie, it is sent with every subsequent request that matches the cookie's domain and path, as the content of a Cookie header in the HTTP request:

GET / HTTP/1.1
Host: www.nytimes.com
Cookie: RMID=007f010109ee57ff1a960007;
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/53.0.2785.143 Safari/537.36

Notice that the client only sends the name-value pair and none of the attributes. This is because the attributes are instructions from the server to the browser about how and when to send the cookie back. 
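
The browser-side decision can be sketched roughly as follows. This is a simplified illustration, not how any particular browser implements it: the cookie object shape and the function names shouldSend and cookieHeader are assumptions, and every cookie is assumed to have been set with an explicit domain attribute.

```javascript
// Decide whether a stored cookie should accompany a request.
// The cookie shape ({ name, value, domain, path, secure }) is an
// assumption made for this sketch.
function shouldSend(cookie, requestUrl) {
  const url = new URL(requestUrl);

  // Domain: the host must equal the cookie's domain or be a subdomain of it.
  const domain = cookie.domain.replace(/^\./, "").toLowerCase();
  const host = url.hostname.toLowerCase();
  const domainOk = host === domain || host.endsWith(`.${domain}`);

  // Path: the cookie's path must be a prefix of the request path,
  // ending on a "/" boundary.
  const cookiePath = cookie.path || "/";
  const pathOk =
    url.pathname === cookiePath ||
    (url.pathname.startsWith(cookiePath) &&
      (cookiePath.endsWith("/") || url.pathname[cookiePath.length] === "/"));

  // Secure cookies travel only over HTTPS.
  const schemeOk = !cookie.secure || url.protocol === "https:";

  return domainOk && pathOk && schemeOk;
}

// Only the name=value pairs go into the Cookie request header.
function cookieHeader(jar, requestUrl) {
  return jar
    .filter((cookie) => shouldSend(cookie, requestUrl))
    .map((cookie) => `${cookie.name}=${cookie.value}`)
    .join("; ");
}

const jar = [
  { name: "RMID", value: "007f010109ee57ff1a960007", domain: ".nytimes.com", path: "/" },
  { name: "beta", value: "1", domain: ".nytimes.com", path: "/labs", secure: true },
];
console.log(cookieHeader(jar, "http://www.nytimes.com/"));
// prints "RMID=007f010109ee57ff1a960007"
```

The second cookie in the jar (a made-up example) is left out of that request because its path does not match and the request is not HTTPS.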

Where are cookies stored in browsers?

The location varies by browser. Most modern browsers use a SQLite database file to store cookies on disk.

Size Limits

For widespread support, it's good practice not to exceed 50 cookies per domain and 4093 bytes in total per domain.
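
As a rough sketch, a check against these two numbers could look like the code below. The constants come straight from the guideline above, the helper names are invented for this example, and measuring a cookie as the UTF-8 bytes of its name=value pair is an assumption, since browsers differ in exactly what they count.

```javascript
// Limits quoted in the text; real browser limits vary.
const MAX_COOKIES_PER_DOMAIN = 50;
const MAX_BYTES_PER_DOMAIN = 4093;

// Size of one cookie, measured as the UTF-8 bytes of "name=value".
function cookieBytes(name, value) {
  return new TextEncoder().encode(`${name}=${value}`).length;
}

// `cookies` is a hypothetical array of { name, value } objects for one domain.
function withinLimits(cookies) {
  const total = cookies.reduce(
    (sum, cookie) => sum + cookieBytes(cookie.name, cookie.value),
    0
  );
  return (
    cookies.length <= MAX_COOKIES_PER_DOMAIN && total <= MAX_BYTES_PER_DOMAIN
  );
}

console.log(cookieBytes("id", "1034820810")); // prints 13
console.log(withinLimits([{ name: "id", value: "1034820810" }])); // prints true
```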

document.cookie

You can use JavaScript to create, read, and delete cookies using the document.cookie property.   

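document.cookie = "username=amir.boroumand; expires=Sat, 31 Dec 2016 12:00:00 UTC; path=/; Secure"

Remember that scripts cannot work with HttpOnly cookies at all: document.cookie will not return a cookie that has the HttpOnly flag set, and the browser rejects a cookie that a script tries to create with that flag. Only the server can set it, through the Set-Cookie header.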

Parsing a cookie can get ugly in JavaScript since the property returns a single string containing all the cookies visible to the page. You have to split the string yourself, knowing that the cookies are separated by a semicolon and a space.
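
One way to do that splitting is sketched below. The parseCookies helper is not a built-in; it is a hypothetical function written for this example, and it assumes values were URL-encoded when they were written, which is a common convention but not guaranteed.

```javascript
// Parse a document.cookie-style string into a plain object.
function parseCookies(cookieString) {
  const cookies = {};
  for (const pair of cookieString.split(";")) {
    const index = pair.indexOf("=");
    if (index === -1) continue; // skip empty or malformed segments
    const name = pair.slice(0, index).trim();
    const raw = pair.slice(index + 1).trim();
    try {
      cookies[name] = decodeURIComponent(raw);
    } catch (err) {
      cookies[name] = raw; // leave malformed escapes as-is
    }
  }
  return cookies;
}

// In a browser you would pass document.cookie; a literal is used here
// so the sketch also runs outside a browser.
const parsed = parseCookies("username=amir.boroumand; lang=en; theme=dark%20mode");
console.log(parsed.username); // prints "amir.boroumand"
console.log(parsed.theme); // prints "dark mode"
```

Splitting each pair at the first "=" only keeps values that themselves contain an equals sign intact.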

Session cookies

A session cookie has no expiration date. These cookies are typically kept in memory rather than written to disk, and are discarded when the browser closes.

Below are some common session ID names from popular web application servers:

  • PHPSESSID (PHP)
  • JSESSIONID (J2EE)
  • ASP.NET_SessionId (ASP.NET)

Persistent cookies

A persistent cookie expires on a fixed date. It is not cleared when the user closes the browser.
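
The difference between the two kinds comes down to whether an expiration date is present. The sketch below illustrates that rule with a made-up cookie shape and made-up function names; it is a simplification and does not model browsers that restore a previous session on startup.

```javascript
// Classify a cookie by its lifetime. The cookie shape ({ name, expires })
// is hypothetical; `expires` is a Date, or undefined for a session cookie.
function cookieKind(cookie) {
  return cookie.expires instanceof Date ? "persistent" : "session";
}

// Return the cookies that should still exist after the browser is
// closed and reopened at time `now`.
function afterRestart(cookies, now = new Date()) {
  return cookies.filter(
    (cookie) => cookieKind(cookie) === "persistent" && cookie.expires > now
  );
}

const stored = [
  { name: "PHPSESSID", value: "abc123" }, // no expiry: session cookie
  { name: "RMID", value: "007f", expires: new Date(Date.UTC(2017, 9, 13)) },
];
const survivors = afterRestart(stored, new Date(Date.UTC(2016, 9, 13)));
console.log(survivors.map((cookie) => cookie.name)); // prints [ 'RMID' ]
```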

First party cookies

A first party cookie is restricted to the same domain as the website you are viewing. For example, if you were visiting nytimes.com, a first party cookie would only be readable by pages inside nytimes.com.

Third party cookies

Cookies can be set only by the domain the browser retrieves content from. However, most commercial websites include content from other websites in the form of ads.

When you visit nytimes.com, the browser will also fetch content from an ad network like Google DoubleClick. This allows DoubleClick to set its own cookie in the browser to log which ads it has shown to you.

Third party cookies are most commonly used for tracking users by advertising networks, search engines, and social media sites.

EU Cookie Law

Companies that operate in the EU must comply with a privacy law that requires disclosure of cookie usage. If you go to google.co.uk, you'll notice a privacy notice appear at the bottom of the page. More information about the law can be found here.