10

I am trying to create a patch from two large folders (~7GB).

Here is how I'm doing it :

$ diff -Naurbw . ../other-folder > file.patch

But, possibly because of the file sizes, the patch is not created and diff fails with an error:

diff: memory exhausted

I tried freeing up more than 15 GB of space, but the issue persists. Could someone help me with the flags I should use?

  • 2
    I googled "diff large files linux" and found this among lots of other links, you could at least pretend to have done some research... this is also off-topic. – Thomas Mar 7 '13 at 5:55
  • 1
    Yes, I tried googling it and found some suggested parameter changes, but the "memory exhausted" error is still there, even with the "--speed-large-files" flag. – pritam Mar 7 '13 at 6:15
  • 1
    How about diffing them in multiple steps? e.g. split the folders into, say, 1GB blocks, diff, then concatenate the patch, though I'm not sure if diff can be split like that (so you might need some extra logic to apply the patch). Why are you diffing 7GB folders in the first place? Surely only some files/folders inside it have changed? – Thomas Mar 7 '13 at 6:19
  • Yes, I tried diffing them separately, creating different patches, and merging them, but the merged patch does not apply. A single patch comes to about 800KB, but after merging it is only 90KB and it does not apply. – pritam Mar 7 '13 at 6:37
  • @pritam check out my answer above, sir – Igor Apr 8 '15 at 7:33
17

I recently ran into this too when I needed to diff two large files (>5GB each).

I tried 'diff' with different options, but even --speed-large-files had no effect. Other methods, such as splitting the files into smaller ones, using xdelta, or sorting the files as per this suggestion, didn't help either. I even got my hands on a very powerful VM (>72GB RAM), but still got the memory exhausted error.

I finally got it to work by adding the following parameter to sysctl.conf (sudo vim /etc/sysctl.conf):

vm.overcommit_memory=1

vm.overcommit_memory has three values (0,1,2) and sets the kernel virtual memory accounting mode. From the proc(5) man page:

0: heuristic overcommit (this is the default)
1: always overcommit, never check
2: always check, never overcommit

To load the new setting from sysctl.conf without rebooting, run

sudo sysctl -p

Don't forget to change this parameter back when you finish!
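
If you would rather not edit sysctl.conf at all, the same change can be made only for the duration of one command. The following is a minimal sketch, assuming bash, sudo rights, and a sysctl binary on the PATH; with_overcommit is a hypothetical helper name I made up for illustration, not part of diff or sysctl.

```shell
#!/usr/bin/env bash
# with_overcommit is a hypothetical helper, not part of any existing tool.
# It runs the given command with vm.overcommit_memory=1 and then puts the
# previous value back, whether or not the command succeeded.
with_overcommit() {
    local previous status=0

    # Remember the current accounting mode (0, 1 or 2).
    previous=$(sysctl -n vm.overcommit_memory) || return 1

    # Switch to "always overcommit" for the duration of the command.
    sudo sysctl -w vm.overcommit_memory=1 >/dev/null || return 1

    # Run the wrapped command and keep its exit status.
    "$@" || status=$?

    # Restore the original mode.
    sudo sysctl -w "vm.overcommit_memory=${previous}" >/dev/null

    return "$status"
}
```

Usage would look like: with_overcommit diff -Naur old-folder new-folder > file.patch. Note that diff itself exits with status 1 when it finds differences, so a non-zero status here does not necessarily mean failure.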

  • 1
    I agree, interesting, non-standard, and it worked for me! Comparing two 70GB files, I see e.g. 317TB virtual and 150TB resident RAM... a comparison that could not run before even with 250GB RAM now completes. Very clever! – David W Mar 17 '16 at 17:02
  • Excellent. You can also add swap space: askubuntu.com/questions/349156/… – YOGO Mar 1 '19 at 8:21
  • 1
    Thank you. It worked for me like this: cat /proc/sys/vm/overcommit_memory, echo 1 > /proc/sys/vm/overcommit_memory, diff, echo 0 > /proc/sys/vm/overcommit_memory, cat /proc/sys/vm/overcommit_memory – matt Jul 4 at 20:27
1

bsdiff is slow and requires a lot of memory, and xdelta creates large deltas for large files.

Try HDiffPatch for large files: https://github.com/sisong/HDiffPatch

  • supports diffing large binary files or directories;
  • runs on Windows, macOS, Linux, and Android;
  • both diff and patch can run with limited memory.

Usage example:

  • Creating a patch: hdiffz -s-256 [-c-lzma2] old_path new_path out_delta_file
  • Applying a patch: hpatchz old_path delta_file out_new_path
0

Try sdiff. It comes preinstalled on some Linux distributions.

sdiff a.txt b.txt --output=c.txt

will show what was modified.

This worked perfectly for me.
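
For large inputs it can help to print only the lines that differ. Here is a small sketch, assuming GNU sdiff from diffutils; show_changed_lines is a hypothetical helper name used only for this example, and it does nothing to reduce the memory sdiff needs.

```shell
#!/usr/bin/env bash
# show_changed_lines is a hypothetical helper name, used for illustration only.
# Print only the differing lines of two files, side by side.
show_changed_lines() {
    # sdiff exits with status 1 when the files differ; treat that as success
    # and only report failure for real errors (status 2).
    sdiff --suppress-common-lines "$1" "$2" || [ $? -eq 1 ]
}
```

Usage would look like: show_changed_lines a.txt b.txt > changes.txt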
