For web developers, exposing your .git folder to the world is a novice mistake. It allows anyone to download your entire source code repository, which often includes database passwords, salts, hashes, and third-party API keys or usernames and passwords.
Over the years, for another personal project, I’ve built up a database of 1.5m reasonably respected domains. These are all either authority sites (such as the BBC or Guardian, or government, educational or military domains), or are domains that have inbound links from at least one of those types of sites.
Out of those 1.5m sites, 2,402 have their .git folder exposed and downloadable. That’s roughly 1 in 600 decent respectable sites, or 0.16% of the sample, that is dangerously exposed.
Some of these .git repositories are harmless, but from a random sample many contain dangerous information that provides a direct vector to attack the site. Hundreds listed database passwords, or included API keys for services such as Amazon AWS or Google Cloud. Others included FTP details to their own web server. Many contained database backups in .SQL files, or the contents of hidden folders that are meant to be restricted.
One prominent human rights group exposed every single person who had signed up to a gay rights campaign (including their home and email addresses) in a CSV file in their Git repository, publicly downloadable from their website. One company that sold digital reports provided its entire database of reports free of charge to anyone who wanted to download their .git folder.
So developers, please, please check that your .git folder is not visible on your website at http://www.yourdomain.com/.git/. If it is, lock it down immediately. Ideally delete the folder and find a better way to deploy your code, or at least make sure access is forbidden using an .htaccess. Then assume that someone has downloaded everything already and work out what they could have seen. What passwords, salts, hashes or API keys do you need to change? What data could they have accessed? What could they have done to alter or impair your service?
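As a rough way to run that check from a script, here is a minimal Python sketch. It assumes that requesting /.git/HEAD is a reasonable test, since that file exists in every Git repository and can be downloadable even when the directory listing itself is disabled. The function names are made up for illustration, and a 40-character commit id assumes the default SHA-1 object format.

```python
import urllib.error
import urllib.request


def looks_like_git_head(body: str) -> bool:
    """Return True if the text resembles the contents of a .git/HEAD file."""
    text = body.strip()
    # A normal HEAD file points at a branch, e.g. "ref: refs/heads/master".
    if text.startswith("ref: refs/"):
        return True
    # A detached HEAD holds a bare 40-character hexadecimal commit id (SHA-1).
    return len(text) == 40 and all(c in "0123456789abcdef" for c in text)


def git_folder_exposed(base_url: str, timeout: float = 10.0) -> bool:
    """Fetch <base_url>/.git/HEAD and report whether it looks like a real Git file."""
    url = base_url.rstrip("/") + "/.git/HEAD"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            body = response.read(200).decode("utf-8", errors="replace")
    except (urllib.error.URLError, OSError):
        # HTTP errors such as 403/404, timeouts and connection failures
        # are all treated here as "not exposed".
        return False
    return looks_like_git_head(body)


if __name__ == "__main__":
    # Placeholder domain from the post; replace it with your own site.
    site = "http://www.yourdomain.com"
    print(site, "exposed" if git_folder_exposed(site) else "looks fine")
```

A "looks fine" result only means this one file was not readable. It does not prove the rest of the folder is locked down, so it is still worth checking the server configuration directly.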
And then please spread the word among other developers too – because right now this must be one of the biggest holes in the internet.
Great post and thoughts!
A very useful tool for when you find that you exposed some keys somewhere in your repo’s history is the BFG Repo-Cleaner (https://rtyley.github.io/bfg-repo-cleaner/). It rewrites your repo’s history *way* faster than the regular git approach and lets you keep that history without exposing your stuff any longer.
Still, step #1 should always be to reset-regenerate keys of anything you were using and *always* use env variables for keys instead of files.
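To illustrate the env variable point, a minimal Python sketch might look like the following. The helper name and the DB_PASSWORD variable are hypothetical, chosen only for the example.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or fail loudly if it is unset or empty."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


if __name__ == "__main__":
    # DB_PASSWORD is a made-up name; the secret lives in the server's
    # environment rather than in a file tracked by git.
    db_password = require_env("DB_PASSWORD")
```

Failing at startup when the variable is missing is a design choice: it avoids quietly falling back to a default that someone later commits to the repo.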
all good here!
I discovered almost the exact same problem in a very large production site once, except it was the .svn directory. I wondered how bad the problem was, but never got around to testing it. You’d think this would be a standard check in Nessus & the like.
For an article pleading with users to hide their .git repos, you don’t provide even a .htaccess or nginx rule showing how.
Any chance you could run the same study for Subversion and Mercurial?
If one still wants to use git, it is better to move the .git folder out of the web root completely and specify --git-dir and --work-tree when working with the repository. This way the .git folder cannot be exposed even if the .htaccess gets deleted or something similar.
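For a deploy script, that could be wrapped roughly as in the Python sketch below. The two paths are hypothetical and the helper names are invented for the example. Only the --git-dir and --work-tree options themselves come from git.

```python
import subprocess


def git_command(git_dir: str, work_tree: str, *args: str) -> list:
    """Build a git command line that uses a repository stored outside the web root."""
    return ["git", f"--git-dir={git_dir}", f"--work-tree={work_tree}", *args]


def run_git(git_dir: str, work_tree: str, *args: str) -> str:
    """Run the command and return its standard output; raises if git exits non-zero."""
    result = subprocess.run(
        git_command(git_dir, work_tree, *args),
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Hypothetical layout: repository data in /srv/repos/site.git,
    # served files in /var/www/site (which then contains no .git folder).
    print(run_git("/srv/repos/site.git", "/var/www/site", "status", "--short"))
```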