Does anyone know what the Git limits are for the number of files and the size of files?
This message from Linus himself can help you with some other limits.

See more in my other answer: the limit with Git is that each repository must represent a "coherent set of files", the "whole system" in itself (you cannot tag "part of a repository"). As illustrated by Talljoe's answer, the limit can be a system one (a large number of files), but if you understand the nature of Git (about data coherency represented by its SHA-1 keys), you will realize the true "limit" is a usage one: i.e., you should not try to store everything in a Git repository unless you are prepared to always get or tag everything back. For some large projects, that would make no sense.

For a more in-depth look at Git limits, see "git with large files", which covers the three issues that limit a Git repo.

A more recent thread (Feb. 2015) illustrates the limiting factors for a Git repo.
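If you want to see where your own repository stands against those factors, the commands below are one way to check. This is a minimal sketch added here for illustration, not part of the original answer:

```
# How many objects does the repository hold, and how much disk do they use?
# -v gives verbose counts, -H prints sizes in human-readable units.
git count-objects -vH

# Repack and prune, then look again; on very large repositories this step
# alone can take a noticeable amount of time.
git gc
git count-objects -vH
```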
If you add files that are too large (GBs in my case; Cygwin, XP, 3 GB RAM), expect Git to run out of memory. More details here.

Update 3/2/11: Saw something similar on Windows 7 x64 with TortoiseGit: tons of memory used and very, very slow system response.
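If you suspect oversized files are the culprit, one way to find the biggest blobs in a repository's history is the pipeline below. It is a sketch added for illustration, not from the answer above, and it only uses standard git plumbing:

```
# List every object reachable from any ref, attach its size and path,
# keep only blobs, and print the ten largest.
git rev-list --objects --all \
  | git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' \
  | awk '$1 == "blob"' \
  | sort -k3 -n -r \
  | head -n 10
```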
There is no real limit -- everything is named with a 160-bit name. The size of a file must be representable in a 64-bit number, so no real limit there either.

There is a practical limit, though. I have a repository that's ~8 GB with >880,000 files, and git gc takes a while. The working tree is rather large, so operations that inspect the entire working directory take quite a while. This repo is only used for data storage, though, so it's just a bunch of automated tools that handle it. Pulling changes from the repo is much, much faster than rsyncing the same data.
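For a repository used that way (data storage consumed by tooling), a shallow clone plus fast-forward-only pulls is one plausible setup. The URL and paths below are placeholders I've chosen, not details from the answer:

```
# One-time setup: fetch only the latest snapshot instead of the full history.
git clone --depth 1 https://example.com/data-repo.git /srv/data-repo

# What an automated consumer would run to refresh its copy of the data;
# --ff-only ensures the script never creates merge commits.
git -C /srv/data-repo pull --ff-only
```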
Back in February 2012, there was a very interesting thread on the Git mailing list from Joshua Redstone, a Facebook software engineer testing Git on a huge test repository.

The tests that were run show that for such a repo Git is unusable (cold operations lasting minutes), but this may change in the future. Basically the performance is penalized by the number of files.
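To get a feel for how the file count (rather than file size) dominates, a throwaway experiment along these lines can help; the counts and names below are made up for illustration and are not from the mailing-list thread:

```
# Build a synthetic repository with roughly 100,000 tiny files ...
git init -q many-files && cd many-files
for d in $(seq 1 100); do
  mkdir -p "dir$d"
  for f in $(seq 1 1000); do echo "$d/$f" > "dir$d/f$f.txt"; done
done
git add . && git commit -qm "100k tiny files"

# ... then time the commands that have to stat the whole working tree.
# Cold-cache runs (after a reboot or dropping the OS page cache) are
# dramatically slower than these warm-cache numbers.
time git status
time git diff
```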
It depends on what you mean. There are practical size limits (if you have a lot of big files, it can get boringly slow). If you have a lot of files, scans can also get slow.

There aren't really inherent limits to the model, though. You can certainly use it poorly and be miserable.
I think it's good to avoid committing large files as part of the repository (e.g. a database dump might be better off elsewhere), but if one considers the size of the kernel in its repository, you can probably expect to work comfortably with anything smaller and less complex than that.
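If the concern is things like database dumps sneaking into commits, ignoring them up front is the simplest guard. The patterns below are examples I've picked for illustration, not ones from the answer:

```
# Keep bulky generated artifacts out of the repository entirely.
cat >> .gitignore <<'EOF'
*.sql
*.dump
backups/
EOF
git add .gitignore
git commit -m "Ignore database dumps and backups"
```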
I have a generous amount of data stored in my repo as individual JSON fragments. There are about 75,000 files sitting under a few directories, and it's not really detrimental to performance. Checking them in the first time was, obviously, a little slow.
I found this while trying to store a massive number of files (350k+) in a repo. Yes, store. Laughs.

The extracts from the Bitbucket documentation are quite interesting. The recommended solution on that page is to split your project into smaller chunks (see the sketch below for one way to do that).
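One hedged way to do such a split, assuming each chunk lives in its own subdirectory and that the git subtree command shipped with Git is available, is sketched below; "big-repo" and "component" are placeholder names:

```
# Extract the history of one subdirectory onto its own branch ...
cd big-repo
git subtree split --prefix=component -b component-only

# ... then seed a new, smaller repository from that branch.
git init ../component-repo
cd ../component-repo
git pull ../big-repo component-only
```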
Git has a 4 GB (32-bit) limit per repo.