bug-gnu-utils

Re: large file support in diff utils


From: Chuck Swiger
Subject: Re: large file support in diff utils
Date: Tue, 12 Apr 2005 23:40:55 -0400
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.6) Gecko/20050319

Paul Eggert wrote:
> Here's a quote from doc/diffutils.texi:
[ ...snip... ]

Thanks for the reference. I was curious enough to write a little test harness to see what happens in practice:

  http://www.pkix.net/~chuck/difftest.py
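For readers who cannot fetch the script, here is a minimal sketch of what such a harness might look like. It is not the contents of difftest.py; the names `write_pair`, `sizes`, and `run_trial`, the line template, and the file names are all hypothetical. It assumes file sizes grow by a factor of 1.5 per trial (which matches the sizes in the logs) and that "ratio" means one changed line per that many lines.

```python
import os
import subprocess
import time

# Hypothetical fixed-width line template; the real script's content is unknown.
LINE = "line %012d of the test file, padded out to a fixed width....\n"

def write_pair(path_a, path_b, size_bytes, ratio):
    """Write two files of about size_bytes; every ratio-th line differs."""
    n = 0
    written = 0
    with open(path_a, "w") as fa, open(path_b, "w") as fb:
        while written < size_bytes:
            text = LINE % n
            fa.write(text)
            # Change one line out of every `ratio` in the second file.
            fb.write(text.upper() if n % ratio == 0 else text)
            written += len(text)
            n += 1
    return n  # number of lines written to each file

def sizes(start=10 * 1024 * 1024, factor=1.5, count=6):
    """Yield file sizes in bytes, each `factor` times the previous one."""
    size = start
    for _ in range(count):
        yield int(size)
        size *= factor

def run_trial(workdir, size_bytes, ratio=100):
    """Time one diff run; return (seconds, exit_status, diff_bytes)."""
    a = os.path.join(workdir, "filea")
    b = os.path.join(workdir, "fileb")
    out = os.path.join(workdir, "diffout")
    write_pair(a, b, size_bytes, ratio)
    start = time.time()
    with open(out, "wb") as fo:
        status = subprocess.call(["diff", a, b], stdout=fo)
    return time.time() - start, status, os.path.getsize(out)
```

Calling `run_trial(workdir, size)` for each `size` in `sizes()` would reproduce the shape of the trials, though not necessarily the same timings or diff sizes.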

This run changes one line in 100 between the two files:

8-sec% ./difftest.py -v -l /usr/tmp/dt
INFO: beginning diff trial run with ratio = 100
filea_size=10485760 (aka 10.000 MB)
time=1.896 filea_size=10MB diff_size=818KB
filea_size=15728640 (aka 15.000 MB)
time=2.880 filea_size=15MB diff_size=1229KB
filea_size=23592960 (aka 22.500 MB)
time=6.287 filea_size=22MB diff_size=1844KB
filea_size=35389440 (aka 33.750 MB)
time=33.905 filea_size=33MB diff_size=2768KB
filea_size=53084160 (aka 50.625 MB)
time=90.391 filea_size=50MB diff_size=4163KB
filea_size=79626240 (aka 75.938 MB)
Killed
NOTICE: diff exitted with errno 137
time=141.118 filea_size=75MB diff_size=1616KB
302.30s real  81.60s user  14.90s system  31%

[ This is on a 1GHz P3 w/256 MB RAM and 500MB datasize & IDE disks, limited swap. ]

...this changes one line in ten:

16-pi% ./difftest.py -v -r 10
INFO: beginning diff trial run with ratio = 10
filea_size=10485760 (aka 10.000 MB)
time=2.099 filea_size=10MB diff_size=8121KB
filea_size=15728640 (aka 15.000 MB)
time=5.504 filea_size=15MB diff_size=11MB
filea_size=23592960 (aka 22.500 MB)
time=11.174 filea_size=22MB diff_size=17MB
filea_size=35389440 (aka 33.750 MB)
time=16.881 filea_size=33MB diff_size=26MB
filea_size=53084160 (aka 50.625 MB)
time=57.728 filea_size=50MB diff_size=40MB
filea_size=79626240 (aka 75.938 MB)
time=214.114 filea_size=75MB diff_size=60MB
filea_size=119439360 (aka 113.906 MB)
time=349.576 filea_size=113MB diff_size=91MB
filea_size=179159040 (aka 170.859 MB)
diff: memory exhausted
NOTICE: diff exitted with errno 2
time=51.396 filea_size=170MB diff_size=0KB
801.97s real  474.19s user  30.29s system  62%

[ A similar box, only using 2 10K SCSI disks (PERC4/aac in RAID-1) and plenty of swap. ]
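A note on the two "errno" values in the output above: they are most likely process exit statuses rather than C errno values (an assumption; the script's source is not shown here). Under that reading, 137 is the shell convention of 128 plus the signal number, so 128 + 9 means SIGKILL, which fits the "Killed" line. GNU diff itself documents exit status 0 for no differences, 1 for differences, and 2 for trouble, which fits "memory exhausted". A small sketch of that decoding, with a hypothetical helper name:

```python
import signal

def describe_exit(status):
    """Interpret a shell-style exit status from a diff run (hypothetical helper)."""
    if status > 128:
        # Shell convention: 128 + N means the process died from signal N.
        signum = status - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            name = "signal %d" % signum
        return "killed by " + name
    if status == 0:
        return "no differences"
    if status == 1:
        return "differences found"
    # GNU diff uses 2 for trouble, e.g. running out of memory.
    return "trouble (diff exit status %d)" % status

print(describe_exit(137))
print(describe_exit(2))
```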

...and finally:

INFO: beginning diff trial run with ratio = 100
filea_size=10485760 (aka 10.000 MB)
time=4.861 filea_size=10MB diff_size=874KB
filea_size=15728640 (aka 15.000 MB)
time=7.665 filea_size=15MB diff_size=1285KB
filea_size=23592960 (aka 22.500 MB)
time=11.762 filea_size=22MB diff_size=1936KB
filea_size=35389440 (aka 33.750 MB)
time=31.658 filea_size=34MB diff_size=2913KB
filea_size=53084160 (aka 50.625 MB)
time=47.598 filea_size=51MB diff_size=4378KB
filea_size=79626240 (aka 75.938 MB)
time=68.977 filea_size=76MB diff_size=6575KB
filea_size=119439360 (aka 113.906 MB)
time=140.650 filea_size=115MB diff_size=9872KB
filea_size=179159040 (aka 170.859 MB)
time=424.586 filea_size=172MB diff_size=14831KB
filea_size=268738560 (aka 256.289 MB)
time=546.334 filea_size=258MB diff_size=22332KB
filea_size=403107840 (aka 384.434 MB)
time=957.059 filea_size=388MB diff_size=33582KB
filea_size=604661760 (aka 576.650 MB)
diff: memory exhausted
NOTICE: diff exitted with errno 2
time=105.728 filea_size=582MB diff_size=0KB
5610.90s real  3268.63s user  1761.90s system  89%

This was on a Sun E450 running Solaris 8 with 2 GB of RAM and 3 GB of swap, using diff (GNU diffutils) 2.8.1. The data lived on a 4-disk RAID-10 array of 15K SEAGATE-ST336752LC drives; the box also has a 2-disk RAID-1 of 10K OEM Sun 18GB'ers. Resource limits were set to unlimited.
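To see how the cost per megabyte grows with file size, the `time=` lines can be reduced to seconds per MB. The following is only a sketch: the regular expression is written against the log format shown in this message, and the function name `seconds_per_mb` is hypothetical.

```python
import re

# Matches lines such as: time=4.861 filea_size=10MB diff_size=874KB
PATTERN = re.compile(r"time=([\d.]+) filea_size=(\d+)MB diff_size=(\d+)(?:KB|MB)")

def seconds_per_mb(log_text):
    """Return (size_mb, seconds_per_mb) for each run that produced a diff."""
    rows = []
    for m in PATTERN.finditer(log_text):
        secs = float(m.group(1))
        size_mb = int(m.group(2))
        diff_size = int(m.group(3))
        if diff_size == 0:
            # diff_size=0 marks a run that failed before writing any output.
            continue
        rows.append((size_mb, secs / size_mb))
    return rows

sample = """\
time=4.861 filea_size=10MB diff_size=874KB
time=957.059 filea_size=388MB diff_size=33582KB
time=105.728 filea_size=582MB diff_size=0KB
"""

for size_mb, rate in seconds_per_mb(sample):
    print("%4d MB: %.2f s/MB" % (size_mb, rate))
```

On the E450 numbers, the per-megabyte cost rises roughly fivefold between the 10MB and 388MB runs, so the running time is clearly growing faster than linearly in file size.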

--
-Chuck





