Abusing Gmail to get previously unlisted e-mail addresses

A classic user enumeration attack on Gmail that allowed me to retrieve thousands of e-mail addresses

Sample of collected data

tl;dr: I discovered a glitch that allowed me to guess, in large numbers, existing Google account addresses that would otherwise be unknown.
DISCLAIMER: it’s just bruteforce that wasn’t properly rate-limited, nothing too fancy, so if you’re looking for some juicy 0day please move along 😉

The glitch

The URL that isn’t rate-limited is https://mail.google.com/mail/gxlu. I noticed that providing a non-existent user/e-mail triggers different response headers from the server. For example, with a valid account you’ll get:

And with a non-existent account:

Both requests get the same HTTP 204 No Content reply, but for a request against an existing account the server adds a Set-Cookie header.
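To make the difference concrete, here is a minimal sketch of the check, not the actual PoC. It only looks at a response that has already been received. The function name is hypothetical, and the cookie value in the usage lines is a made-up placeholder, since the real headers are not reproduced here.

```python
from typing import Mapping


def looks_like_existing_account(status: int, headers: Mapping[str, str]) -> bool:
    """Classify a response using the pattern described above.

    Assumption: both cases answer 204 No Content, and only the
    'existing account' case carries a Set-Cookie header.
    """
    if status != 204:
        # Anything else is outside the observed behaviour, so refuse to guess.
        raise ValueError(f"unexpected status code: {status}")
    # Header names are case-insensitive, so compare in lower case.
    return any(name.lower() == "set-cookie" for name in headers)


# Hypothetical header sets, for illustration only.
print(looks_like_existing_account(204, {"Set-Cookie": "placeholder=1"}))
print(looks_like_existing_account(204, {"Content-Length": "0"}))
```

The point is that the status code alone tells you nothing here; the whole leak sits in one extra header.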

Guessing valid addresses

So obviously I decided to write a Python script to abuse this… 😏

The main idea was to look up firstname.lastname@gmail.com addresses that are likely to exist. First step: getting a list of known first names and last names. Thanks to Facebook and a 2010 information leak, such lists are publicly available. Another idea was to generate fake personas using randomuser.me and check whether they match existing accounts.
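The candidate generation step could look roughly like this. It is a sketch under simple assumptions: the name lists are already loaded in memory, names are plain ASCII, and the function name is mine rather than something from the original script.

```python
from itertools import product
from typing import Iterable, Iterator


def candidate_addresses(
    first_names: Iterable[str],
    last_names: Iterable[str],
    domain: str = "gmail.com",
) -> Iterator[str]:
    """Yield firstname.lastname@domain for every combination of names."""
    for first, last in product(first_names, last_names):
        # Normalise to lower case and drop stray whitespace from the lists.
        yield f"{first.strip().lower()}.{last.strip().lower()}@{domain}"


for address in candidate_addresses(["Ada", "Alan"], ["Lovelace"]):
    print(address)
```

Using a generator keeps memory flat even when the two lists multiply into millions of combinations.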

This way I was able to guess around 40,000 valid e-mail addresses per day with a stupid unoptimized PoC.

Why would this pose a privacy threat?

The issue here is that a large part of these addresses are unknown to the public. People might want their privacy respected and not be spammed by bots, right?
Bruteforcing can be limited easily: captcha, rate-limiting, etc. You already get these protections on most Google services. Take Google over Tor, for example: there are captchas everywhere!
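As an illustration of how cheap rate-limiting can be, here is a minimal token bucket sketch. It is a generic technique, not a description of what Google runs, and the class name, parameters and injectable clock are my own choices for the example.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow short bursts up to `capacity`, then `refill_per_second` requests/s."""

    def __init__(
        self,
        capacity: int,
        refill_per_second: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.clock = clock  # injectable so the bucket can be tested without sleeping
        self.tokens = float(capacity)
        self.last = clock()

    def allow(self) -> bool:
        """Return True if the request may proceed, False if it should be rejected."""
        now = self.clock()
        elapsed = now - self.last
        self.last = now
        # Refill according to elapsed time, never above the bucket capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# One bucket per client (IP address, session, ...) is the usual arrangement.
bucket = TokenBucket(capacity=5, refill_per_second=1.0)
print([bucket.allow() for _ in range(7)])
```

With one such bucket per client, a single client is held to roughly the refill rate once its initial burst is spent.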

How likely is an unlisted e-mail to be known/pwned?

I checked the addresses against the haveibeenpwned.com API. The idea was to get the probability that a random, valid e-mail address appears in some kind of leaked database. Interestingly, it is not that likely!

Only 8.41% of tested addresses were in such a database. It’s also worth noting that not all the e-mails I got are necessarily active right now; my list may contain some old and unused addresses. Here are the breaches an e-mail is most likely to be in:
- River City Media Spam List (4.98%)
- SC Daily Phone Spam List (2.63%)
- LinkedIn (2.46%)
- Dropbox (1.52%)
- MySpace (1.37%)
- Adobe (1.33%)
- Modern Business Solutions (1.14%)
- Special K Data Feed Spam List (1.01%)
- Tumblr (0.73%)
- Last.fm (0.51%)
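For reference, figures like these can be derived from the raw lookups with a few lines. This is a sketch that assumes the per-breach percentages are shares of all tested addresses, and that the lookup results are already collected into a dict mapping each address to its list of breach names. The function name and the sample data are invented for the example.

```python
from collections import Counter
from typing import Dict, List, Tuple


def breach_stats(results: Dict[str, List[str]]) -> Tuple[float, List[Tuple[str, float]]]:
    """Return (% of addresses in any breach, [(breach, % of addresses), ...])."""
    total = len(results)
    if total == 0:
        return 0.0, []
    pwned = sum(1 for breaches in results.values() if breaches)
    # Count each breach at most once per address.
    counts = Counter(name for breaches in results.values() for name in set(breaches))
    ranking = [(name, 100 * n / total) for name, n in counts.most_common()]
    return 100 * pwned / total, ranking


# Invented sample data, not the collected dataset.
sample = {
    "a@example.com": ["LinkedIn", "Dropbox"],
    "b@example.com": [],
    "c@example.com": ["LinkedIn"],
    "d@example.com": [],
}
share, ranking = breach_stats(sample)
print(share, ranking)
```

Because one address can sit in several breaches, the per-breach shares can add up to more than the overall share.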

Impact

This bug could be exploited by malicious people. At best, that means guerrilla marketing campaigns (i.e. receiving unsolicited e-mail); at worst, phishing and ransomware attacks, as usual.

Google response

02/03/2017 14:54:00 (UTC+1): contacted Google about this issue
02/03/2017 17:13:00 (UTC+1): Google replied “your report was triaged and we’re currently looking into it.”
02/03/2017 17:27:00 (UTC+1): Google replied that they “decided to route it internally to a team that deals with similar issues”
22/03/2017 00:56:00 (UTC+1): Google replied “We haven’t forgotten about your report; we’re still looking into it and will get back to you in a couple of days.”
31/03/2017 16:29:00 (UTC+1): Google decided not to classify this as a security bug ¯\_(ツ)_/¯