Cookies

As a public service, I would like to present the following complete explanation of how cookies work. I was provoked into writing this essay by a paragraph that appeared recently in the paper. Unfortunately, I didn't think to record the date, or even which paper it was, so I can't give credit properly.

Search for drug terms like “grow pot” on some Internet sites, and an ad banner that pops up from the drug office may drop a “cookie” program in your computer that tracks your online activities.

Calling a cookie a program is so wrong! This, on top of all the other misinformation one sees about cookies, was the last straw.

To explain how cookies work, I'm going to show you the entire cookie process, using actual data I collected just now. Before I get into the details, though, I'd like to make a couple of comments on cookie policy.

First, as I understand it, there are plenty of commercial products and systems that use cookies for various standard purposes right out of the box. I expect that's what happened with the drug office. It's certainly possible that cookies could be used in an attempt to find drug users, but that would require both intent and technical competence, and I doubt either is present.

Second, although I seem to be saying cookies are just fine, I actually prefer to avoid them, because they're mostly used as an element of advertising, and advertising, as a distracting component of consumerism, offends me. Recently I've been making use of Internet Junkbuster, a nice little proxy server that can block both ads and cookies. For “nice”, here, read free, open-source, reliable, small, and fast. The only downside is that it takes a bit of effort to configure; maybe one day I'll post my minimal configuration files here.

Now, on to the explanation of cookies!

Before getting into the details, it'd probably help if I outlined the process. Whenever a browser sends a request to a server (roughly, once per page plus once per image), the server can tell the browser to create one or more cookies. Then, whenever the browser sends another request to that same server (or, actually, to any server that matches the cookie), it includes the cookie information in the request.
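To make the outline concrete, here's a minimal sketch in Python of that round trip. The names ToyBrowser and toy_server are made up for illustration, and the sketch stores cookies by exact host name, which is a simplification of the real matching rules.

```python
# Toy model of the cookie round trip. ToyBrowser and toy_server are
# hypothetical names; real browsers match cookies by domain and path,
# not by exact host name as this sketch does.

class ToyBrowser:
    def __init__(self):
        self.jar = {}  # host -> {cookie name: cookie value}

    def build_request(self, host):
        """Return the headers the browser would send to this host."""
        headers = {"Host": host}
        cookies = self.jar.get(host, {})
        if cookies:
            headers["Cookie"] = "; ".join(f"{k}={v}" for k, v in cookies.items())
        return headers

    def handle_response(self, host, headers):
        """Store any cookie the server asked the browser to create."""
        if "Set-Cookie" in headers:
            pair = headers["Set-Cookie"].split(";")[0]
            name, _, value = pair.strip().partition("=")
            self.jar.setdefault(host, {})[name] = value


def toy_server(request_headers):
    """Ask for a cookie only when the request did not already carry one."""
    if "Cookie" in request_headers:
        return {}
    return {"Set-Cookie": "p_uniqid=5gmoyODX+g0G3nbm0A; path=/"}


browser = ToyBrowser()
first = browser.build_request("www.wired.com")
browser.handle_response("www.wired.com", toy_server(first))
second = browser.build_request("www.wired.com")
print(first)   # no Cookie header yet
print(second)  # now includes the Cookie header
```

The first request carries no cookie, the response creates one, and the second request carries it back. Nothing gets executed anywhere; the browser is just storing and repeating a string.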

So, with Internet Junkbuster turned off, I opened a new browser window and went to the following page, which I had previously bookmarked. (It's an article on Wired that I found while reading news on Lycos, not that it matters.)

http://www.wired.com/news/lycos/0,1306,37435,00.html

What the browser did, at this point, was send an HTTP (HyperText Transfer Protocol) request to the server www.wired.com. Here is the complete text of the request.

GET http://www.wired.com/news/lycos/0,1306,37435,00.html HTTP/1.0
Proxy-Connection: Keep-Alive
User-Agent: Mozilla/4.05 [en] (Win95; I)
Host: www.wired.com
Accept: image/gif, image/x-xbitmap,
   image/jpeg, image/pjpeg, image/png, */*
Accept-Language: en
Accept-Charset: iso-8859-1,*,utf-8

And here's the HTTP response the server sent back.

HTTP/1.1 200 OK
Date: Sun, 09 Jul 2000 16:07:41 GMT
Server: Apache/1.3.9 (Unix)
Set-Cookie: p_uniqid=5gmoyODX+g0G3nbm0A;
   path=/; domain=.wired.com;
   expires=Thu, 31-Dec-2037 23:59:59 GMT
Cache-Control: no-cache="Set-Cookie"
Connection: close
Content-Type: text/html

<html>

   :

</html>

This isn't the complete text of the response, because I cut out what would normally be the important part: the actual article, presented in HTML format. What's important for us is the line that begins with Set-Cookie—that's how the server tells the browser to create a cookie. Accordingly, the cookie was now supposed to be in my cookie file, cookies.txt, and, indeed, there it was. (This is the entire file, since I started with no cookies.)

# Netscape HTTP Cookie File
# http://www.netscape.com/newsref/std/cookie_spec.html
# This is a generated file!  Do not edit.

.wired.com TRUE / FALSE 2145916631 p_uniqid 5gmoyODX+g0G3nbm0A
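To show how the Set-Cookie line and the cookie file entry line up, here's a rough Python sketch that parses both. The function names and the field names in FIELDS are my own labels, not anything official, and I'm assuming the seven columns mean domain, a flag for whether all hosts in the domain may see the cookie, path, a secure flag, expiration, name, and value. It splits on whitespace, so it would break on a cookie value containing spaces.

```python
def parse_set_cookie(header_value):
    """Split a Set-Cookie value into (name, value, attributes)."""
    first, *rest = [part.strip() for part in header_value.split(";")]
    name, _, value = first.partition("=")
    attributes = {}
    for part in rest:
        if not part:
            continue
        key, _, val = part.partition("=")
        attributes[key.strip().lower()] = val.strip()
    return name, value, attributes


# Hypothetical labels for the seven columns of a cookies.txt line.
FIELDS = ("domain", "all_hosts", "path", "secure", "expires", "name", "value")


def parse_cookie_file_line(line):
    """Turn one data line of cookies.txt into a dictionary."""
    record = dict(zip(FIELDS, line.split()))
    record["all_hosts"] = record["all_hosts"] == "TRUE"
    record["secure"] = record["secure"] == "TRUE"
    record["expires"] = int(record["expires"])
    return record


header = ("p_uniqid=5gmoyODX+g0G3nbm0A; path=/; domain=.wired.com; "
          "expires=Thu, 31-Dec-2037 23:59:59 GMT")
name, value, attrs = parse_set_cookie(header)
record = parse_cookie_file_line(
    ".wired.com TRUE / FALSE 2145916631 p_uniqid 5gmoyODX+g0G3nbm0A")

print(name, value, attrs)
print(record)
```

The name, value, domain, and path come through unchanged; only the expiration date is stored in a different form.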

I then re-requested the same page, and, as expected, there the cookie was, tacked on to the end of the request message. (The rest of the request message was exactly the same.)

Cookie: p_uniqid=5gmoyODX+g0G3nbm0A

That's all there is to it … no programs, just some little pieces of data.

The only other interesting thing about the mechanism is that cookies do have some internal structure. The actual cookie content (p_uniqid=5gmoyODX+g0G3nbm0A) is pretty much opaque to the browser, but the other fields are understood, and are used to determine which requests the cookie should be added to. For details, see (e.g.) the cookie specification mentioned in the cookie file.
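As an illustration of how those other fields might be used, here's a simplified Python sketch of the decision. The function cookie_applies is a name I made up, and the rules are my approximation (domain suffix match, path prefix match, expiration check); it ignores the secure flag and other details the specification covers.

```python
from datetime import datetime, timezone


def cookie_applies(cookie, host, path, now):
    """Decide whether a stored cookie should be attached to a request.

    Simplified sketch: domain suffix match, path prefix match, not expired.
    """
    domain = cookie["domain"]
    if domain.startswith("."):
        host_ok = host.endswith(domain) or host == domain[1:]
    else:
        host_ok = host == domain
    path_ok = path.startswith(cookie["path"])
    not_expired = now < cookie["expires"]
    return host_ok and path_ok and not_expired


cookie = {"domain": ".wired.com", "path": "/", "expires": 2145916631}

# Use the Date header from the response above as "now".
now = int(datetime(2000, 7, 9, 16, 7, 41, tzinfo=timezone.utc).timestamp())

print(cookie_applies(cookie, "www.wired.com",
                     "/news/lycos/0,1306,37435,00.html", now))
print(cookie_applies(cookie, "www.lycos.com", "/", now))
```

With these rules the cookie goes along with any request to a host under wired.com, and with nothing sent to any other site.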

By the way, if you're looking at the example in detail, it may interest you to know that the number 2145916631 is the representation of the expiration date as the number of seconds after some base time around 1970. (I expected the base time to be midnight, January 1, 1970, but it didn't work out quite right: the stored number is 168 seconds short of what that base time gives for the expiration date in the header.) It may also interest you to know that the largest positive number that can be stored in a standard signed four-byte variable is

2^31 − 1 = 2147483647,

which corresponds to some time in mid-January 2038. (This is why there will be a Y2038 bug.)
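The arithmetic is easy to check with a few lines of Python. This sketch assumes the base time is midnight UTC on January 1, 1970; the helper names are mine.

```python
from datetime import datetime, timedelta, timezone

# Assumed base time: midnight UTC, January 1, 1970.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def seconds_since_epoch(moment):
    """Whole seconds between the assumed base time and the given moment."""
    return int((moment - EPOCH).total_seconds())


def moment_from_seconds(seconds):
    """Inverse of seconds_since_epoch."""
    return EPOCH + timedelta(seconds=seconds)


header_expiry = datetime(2037, 12, 31, 23, 59, 59, tzinfo=timezone.utc)
stored = 2145916631  # the number found in cookies.txt

print(seconds_since_epoch(header_expiry))           # 2145916799
print(seconds_since_epoch(header_expiry) - stored)  # 168
print(moment_from_seconds(stored))                  # 2037-12-31 23:57:11+00:00
print(moment_from_seconds(2**31 - 1))               # 2038-01-19 03:14:07+00:00
```

So, under that assumption, the stored number corresponds to a moment 168 seconds before the expiration time in the header, and the four-byte limit falls on January 19, 2038.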

I was going to joke that the actual cookie content is opaque not only to the browser, but also to me. However, it actually isn't opaque to me; it is clearly just a stereotypical use of cookies for user tracking. The server has assigned me a unique identifier (p_uniqid … “p” for person?), and the browser is supposed to attach this identifier to all future requests I make. Of course, this doesn't tell me how the tracking information is being used; it could be anything from just counting the number of readers all the way up to remembering which articles I've requested in the past and then inserting targeted ads into the HTML content.
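To be clear about how little machinery this kind of tracking needs, here's a hypothetical Python sketch of the server side. I have no idea what Wired's server actually does; handle_request and history are invented names, and the identifier is just a random string.

```python
import secrets
from collections import defaultdict

# Hypothetical server-side state: identifier -> list of requested paths.
history = defaultdict(list)


def handle_request(path, cookie_header=None):
    """Return (identifier, response headers) for one request."""
    headers = {}
    identifier = None
    if cookie_header:
        for pair in cookie_header.split(";"):
            name, _, value = pair.strip().partition("=")
            if name == "p_uniqid":
                identifier = value
    if identifier is None:
        # First visit: invent an identifier and ask the browser to keep it.
        identifier = secrets.token_urlsafe(12)
        headers["Set-Cookie"] = f"p_uniqid={identifier}; path=/"
    history[identifier].append(path)
    return identifier, headers


visitor, first_headers = handle_request("/news/a.html")
_, second_headers = handle_request("/news/b.html", f"p_uniqid={visitor}")

print(first_headers)     # contains a Set-Cookie header
print(second_headers)    # empty: the visitor was recognized
print(history[visitor])  # both paths, in order
```

Whether the history is used only to count readers or to pick ads is entirely up to the server; the cookie itself carries nothing but the identifier.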

@ August (2000)