Hidden Services in Tor
Tor is a service run by a network of volunteers that lets people use the internet anonymously. It is normally used to browse the web without being tracked or identified.
A lesser-known feature of Tor is the ability to host what Tor calls hidden services. Hidden services are basically servers that provide services through the Tor network. When you think about Tor, the first thing that comes to mind is anonymous web browsing. However, for hacktivists and dissidents it is very useful not only to browse the web without being identified, but also to publish web pages in a way that cannot easily be tracked or shut down.
The Tor network hosts thousands of ‘hidden services’, accessible only to people using Tor, that provide access to forbidden information on very different topics. Those sites have a hidden DNS address with the .onion TLD, for example example.onion. Sites ending in .onion cannot easily be tracked or shut down, and their owners cannot easily be identified.
One of the most complex parts of setting up a hidden service is configuring the web server so that it doesn’t leak the real IP address of the server, the country where it is located, and so on. The more complex the site, the more difficult it becomes to set up a truly hidden service that doesn’t leak information in any way.
In recent years, the FBI has been able to identify and shut down certain hidden services using social engineering, information leaks and browser vulnerabilities. The most famous example is Silk Road, a well-known black market hidden inside Tor and used for selling drugs and similar goods.
Of course, the administrators behind hidden services do their best not to leak any information about the physical location of the server, or anything else that could lead to the identification of the owner of the hidden service.
Leaking the timezone
The HTTP protocol allows the client to inform the server about its compression capabilities. If the client and server both support a specific compression format, the server can decide to compress the HTTP response in order to save bandwidth and time. All major web servers and browsers support compression. The most common formats used for HTTP compression are gzip and DEFLATE.
Gzip is a compression format that allows relatively fast data compression with decent compression ratios.
As a compression format, gzip specifies a header to be included in the compressed output. This header includes information about the compressed data, the operating system that compressed it, and most importantly, the date when the data was compressed.
The header is as follows, as you can see in the Forensics Wiki:
Offset | Size | Value | Description |
---|---|---|---|
0 | 2 | 0x1f 0x8b | Magic number to identify gzip streams |
2 | 1 | | Compression method |
3 | 1 | | Flags |
4 | 4 | | Compression date |
8 | 1 | | Compression flags |
9 | 1 | | Operating system |
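To make the layout concrete, here is a minimal Python sketch that decodes those ten bytes. It assumes the date field is the 4-byte little-endian Unix timestamp that the gzip specification (RFC 1952) calls MTIME. The function name `parse_gzip_header` and the sample timestamp are made up for illustration.

```python
import gzip
import struct
from datetime import datetime, timezone


def parse_gzip_header(data: bytes) -> dict:
    """Decode the fixed 10-byte gzip header (hypothetical helper)."""
    if len(data) < 10:
        raise ValueError("need at least 10 bytes")
    # "<" = little-endian, no padding: two magic bytes, method, flags,
    # 4-byte date, compression flags, operating system.
    id1, id2, method, flags, mtime, xfl, os_id = struct.unpack("<BBBBIBB", data[:10])
    if (id1, id2) != (0x1F, 0x8B):
        raise ValueError("not a gzip stream")
    return {
        "method": method,
        "flags": flags,
        "mtime": mtime,
        # A value of 0 means the compressor did not record a date.
        "compressed_at": datetime.fromtimestamp(mtime, tz=timezone.utc) if mtime else None,
        "compression_flags": xfl,
        "os": os_id,
    }


# Sample stream built locally with a fixed date (2016-02-21 13:21:21 UTC).
sample = gzip.compress(b"hello", mtime=1456060881)
header = parse_gzip_header(sample)
print(header["compressed_at"])
```

The sample is compressed locally so the sketch runs without touching the network. The same function should work on the first ten bytes of any gzip stream.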
So, if this header is present in any gzip-compressed data, we can send a request that accepts gzip to any web server, wait for the gzip-compressed response, check whether the body starts with 0x1f 0x8b, and then read the compression date to learn the exact date configured on the server that serves the page.
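The request side can be sketched in a few lines of Python. This is not the proof of concept described later, only an illustration using the standard `urllib`, which does not decompress response bodies on its own. The helper names `mtime_from_body` and `fetch_gzip_mtime` are hypothetical, and reaching a .onion address would additionally require routing the request through Tor, which is not shown here.

```python
import struct
import urllib.request

GZIP_MAGIC = b"\x1f\x8b"


def mtime_from_body(body: bytes):
    """Return the raw date field of a gzip body, or None if absent or zeroed."""
    if len(body) < 10 or body[:2] != GZIP_MAGIC:
        return None  # the server did not answer with gzip
    (mtime,) = struct.unpack("<I", body[4:8])  # bytes 4..7, little-endian
    return mtime or None  # 0 means the server filled the field with zeroes


def fetch_gzip_mtime(url: str, timeout: float = 10.0):
    """Ask for a gzip response and inspect only its first ten bytes."""
    request = urllib.request.Request(url, headers={"Accept-Encoding": "gzip"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = response.read(10)  # raw, still-compressed bytes
    return mtime_from_body(body)
```

Calling `fetch_gzip_mtime("https://example.com/")` would return an integer timestamp only if that server compresses the response and fills in the date field. Otherwise it returns `None`.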
With normal web servers this is useful only in very limited scenarios, because the geographical location of the server is not hidden in any way and can easily be derived from its IP address, which is not hidden at all. With a hidden service, however, information about the server’s timezone can be very useful for narrowing down the countries where the server might be running.
This is, of course, NOT Tor’s fault, and it is not a bug in the Tor protocol or anything like that. It is just an obscure feature of the gzip format, which most web servers make available over HTTP by default.
The good news is that lots of web servers are preconfigured to fill the date field of the gzip header with zeroes, maybe for performance reasons, who knows. After some research, I found that around 10% of web servers leak the remote date when compressing HTTP responses with gzip.
If you want to test this, instagram.com, reddit.com and bing.com leak the remote date in their gzip-encoded HTTP responses.
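As a rough illustration of how a leaked date could be turned into a timezone guess, the sketch below compares it with the observer’s own UTC clock at the moment of the response. It rests on two loud assumptions that the text above does not guarantee: that the server’s clock is accurate, and that the server writes its local wall-clock time into the field. The function name `estimate_utc_offset_hours` is hypothetical.

```python
def estimate_utc_offset_hours(leaked_mtime: int, observed_utc: int) -> float:
    """Guess the server's UTC offset from a leaked gzip date (illustrative only).

    leaked_mtime: the date field read from the gzip header, in seconds.
    observed_utc: the observer's own Unix time when the response arrived.
    """
    delta_seconds = leaked_mtime - observed_utc
    # Real-world offsets are multiples of 15 minutes, so snap to that grid
    # to absorb network latency and small clock drift.
    quarter_hours = round(delta_seconds / 900)
    return quarter_hours * 0.25


# Assumed example: the leaked date is 8 hours and 4 seconds ahead of our clock.
print(estimate_utc_offset_hours(1456060881 + 8 * 3600 + 4, 1456060881))
```

An offset near zero says little on its own, since it is also what a server that stores plain UTC would produce.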
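The effect of zeroing the field is easy to reproduce locally. This sketch uses Python’s `gzip.compress`, whose `mtime` argument (available since Python 3.8) controls the date written to the header. It only illustrates the idea and is not how any particular web server is configured.

```python
import gzip

payload = b"<html><body>hello</body></html>"

# Default behaviour: the current time is written into bytes 4..7.
leaky = gzip.compress(payload)

# Passing mtime=0 fills the date field with zeroes instead.
scrubbed = gzip.compress(payload, mtime=0)

print(leaky[4:8].hex())
print(scrubbed[4:8].hex())
```

Both streams decompress to the same payload. Only the header differs, so a server loses nothing by leaving the date out.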
Of course, because of privacy concerns, I’m not going to provide information on which hidden services are leaking the remote date.
Proof Of Concept
I have developed a little PHP script that uses curl (command line) to get the remote server date, if available, from the gzip-compressed HTTP response. It only works with web servers that allow compression of HTTP responses and fill the ‘date’ field of the gzip header with the correct date instead of zeroes.
Examples of use:
user@localhost:~$ php time.php bing.com
The server that processed the request on: bing.com has local date set to:
Sunday 21st of February 2016 01:21:21 PM
user@localhost:~$ php time.php reddit.com
The server that processed the request on: reddit.com has local date set to:
Sunday 21st of February 2016 09:21:25 PM
user@localhost:~$ php time.php instagram.com
The server that processed the request on: instagram.com has local date set to:
Sunday 21st of February 2016 09:21:30 PM
user@localhost:~$
The Proof of concept is available here: