[Bug-tar] Re: GNU TAR or STAR -- speed comparision

From: Joerg Schilling
Subject: [Bug-tar] Re: GNU TAR or STAR -- speed comparision
Date: Thu, 18 Oct 2007 17:14:25 +0200
User-agent: nail 11.22 3/20/05

"Jan Psota" <address@hidden> wrote:

> > If you know a faster tar implementation than star, send the name,
> > a test and a bug report!
> Here You are:
> I made some tests to find out, which tar implementation is faster:
>       star-1.5_alpha85
> or
>       tar-1.19
> Both programs were compiled with gcc-4.2.2 on Gentoo and with CFLAGS set to
> "-O2 -march=athlon-xp -mfpmath=sse -frename-registers -pipe"

I never really did a performance analysis of star because it simply was the
fastest known tar implementation.

Now I did some more testing to check your results.

I first used the default optimization from the native package and cc. In this
case, GNU tar takes approx. 3x the user CPU time of star, so I compiled
GNU tar again with "cc -fast". This reduced the user CPU time for GNU tar
to between 1.6x and 2.0x the time star needs. As star typically needs
exactly the same system CPU time as GNU tar and typically only needs
55% to 65% of the user CPU time GNU tar needs, it should be obvious that
star is faster, especially as star implements better buffering.

> All tests were executed on 1500MHz Athlon-XP, 512M SDRAM on KM133,
> 40G Maxtor (50MB/s).

Which OS?

I used a quad core Opteron system (two Opteron 880) at 2400 MHz with 4 GB of
registered ECC RAM and a 300 GB Maxtor that delivers ~ 60 MB/s, running
Solaris Nevada build 51 (I know this is a bit old...).

> -----------------------------------------------------------------------------
>       creating archives [about 72MB, 22000 files]

I did all my tests with ~ 3.3 GB from /usr (144403 files)

> [2G]
> tar   real    2m56.360s       user    0m0.460s        sys     0m27.634s
> star  real    2m54.600s       user    0m0.272s        sys     0m41.227s
> tar   real    2m52.244s       user    0m0.448s        sys     0m27.550s
> star  real    2m48.909s       user    0m0.360s        sys     0m41.251s
> tar   real    2m51.499s       user    0m0.488s        sys     0m27.690s
> star  real    2m49.207s       user    0m0.288s        sys     0m40.651s

With /dev/null (or /dev/zero because of anomalous GNU tar behavior) and even
with real files, I get the same system CPU time for GNU tar and star.
Your results look strange.
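
To make the discrepancy easier to see, the quoted [2G] timings can be averaged
per program. The following is only a sketch: the helper names (to_seconds,
parse, mean_times) are made up for illustration, and the data is copied
verbatim from the quoted table above.

```python
import re
from statistics import mean

# Timings copied from the quoted [2G] table (shell `time` format).
QUOTED = """\
tar   real    2m56.360s       user    0m0.460s        sys     0m27.634s
star  real    2m54.600s       user    0m0.272s        sys     0m41.227s
tar   real    2m52.244s       user    0m0.448s        sys     0m27.550s
star  real    2m48.909s       user    0m0.360s        sys     0m41.251s
tar   real    2m51.499s       user    0m0.488s        sys     0m27.690s
star  real    2m49.207s       user    0m0.288s        sys     0m40.651s
"""


def to_seconds(stamp):
    """Convert a stamp such as '2m56.360s' to seconds."""
    m = re.fullmatch(r"(\d+)m([\d.]+)s", stamp)
    if m is None:
        raise ValueError(f"unexpected time format: {stamp!r}")
    return int(m.group(1)) * 60 + float(m.group(2))


def parse(text):
    """Return {program: {'real': [...], 'user': [...], 'sys': [...]}}."""
    out = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 7:
            continue  # skip anything that is not a full timing line
        rec = out.setdefault(fields[0], {"real": [], "user": [], "sys": []})
        # fields[1::2] are the labels, fields[2::2] the matching stamps
        for key, stamp in zip(fields[1::2], fields[2::2]):
            rec[key].append(to_seconds(stamp))
    return out


def mean_times(text):
    """Average each timing column per program."""
    return {prog: {k: mean(v) for k, v in rec.items()}
            for prog, rec in parse(text).items()}


if __name__ == "__main__":
    for prog, t in mean_times(QUOTED).items():
        print(f"{prog:5s} real {t['real']:8.3f}s  "
              f"user {t['user']:.3f}s  sys {t['sys']:.3f}s")
```

The per-program means make it plain that the difference in the quoted numbers
sits in the sys column, not in the user column.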

With real output files, I get the following using microstate accounting from ptime(1):

GNU tar:
real     9:31.649
user        2.831
sys        30.794

star:
real     6:03.418
user        1.549
sys        30.056

Using real tapes instead of files makes this difference even bigger.

Your user CPU times look OK, but please try to find out why star takes more 
system CPU time on your system.

If you have no other ideas, try adding e.g. "-no-fifo" to star's options
and compare against the run without "-no-fifo" to check for the multi-process
overhead on your OS.
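
A comparison like this can be scripted so that both variants are measured the
same way. What follows is a minimal sketch, not a finished benchmark: the
function names are hypothetical, and the two commands in the demo are harmless
placeholders that would have to be replaced by the same star invocation with
and without "-no-fifo". It relies on getrusage(RUSAGE_CHILDREN), which only
counts child processes that have been waited for.

```python
import resource
import subprocess
import sys
import time


def measure(cmd):
    """Run cmd; return (real, user, sys) seconds for it and its children."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.perf_counter()
    subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
    real = time.perf_counter() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    # CPU time is cumulative over all reaped children, so take the delta.
    return (real,
            after.ru_utime - before.ru_utime,
            after.ru_stime - before.ru_stime)


def compare(variants):
    """Measure each named command line once."""
    return {name: measure(cmd) for name, cmd in variants.items()}


if __name__ == "__main__":
    # Placeholder commands (assumption): substitute the real command lines.
    variants = {
        "variant A": [sys.executable, "-c", "pass"],
        "variant B": [sys.executable, "-c", "sum(range(10**6))"],
    }
    for name, (real, user, system) in compare(variants).items():
        print(f"{name}: real {real:.3f}s user {user:.3f}s sys {system:.3f}s")
```

Each variant should be run several times, since single runs can easily differ
by a few percent.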

>       listing archive contents
> command:      like above, except:
>       time $c tf t1.tar > /dev/null
> tar   real    0m0.346s        user    0m0.172s        sys     0m0.172s
> star  real    0m0.559s        user    0m0.108s        sys     0m0.448s
> tar   real    0m0.346s        user    0m0.128s        sys     0m0.216s
> star  real    0m0.554s        user    0m0.108s        sys     0m0.444s

With listing, there seems to be a strange effect with star.

When using larger block sizes (e.g. 1 MB), star is _much_ faster than 
GNU tar (nearly twice as fast).

When using the default block size of 10 kB, star is ~ 20% slower than GNU tar
(regarding wall clock time). The interesting thing is that star seems to
spend this time in the same number of read(2) calls to the input file as GNU
tar does.

This seems to be the only case where I need to look for the cause, as star
definitely should never be slower than GNU tar.
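
To illustrate how strongly the block size changes the number of read(2) calls,
here is a small sketch that reads a file sequentially and counts the calls that
return data. It is not how star or GNU tar read archives; count_reads is a
made-up helper, and taking "10 kB" to mean 10240 bytes is an assumption.

```python
import os
import tempfile


def count_reads(path, block_size):
    """Read path sequentially; return (calls, bytes_read).

    calls counts only the read(2) calls that returned data, not the
    final call that signals end of file.
    """
    calls = total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        while True:
            chunk = os.read(fd, block_size)
            if not chunk:
                break
            calls += 1
            total += len(chunk)
    finally:
        os.close(fd)
    return calls, total


if __name__ == "__main__":
    # Scratch file standing in for an archive (size chosen arbitrarily).
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"\0" * (4 * 1024 * 1024))
        path = f.name
    try:
        for block_size in (10 * 1024, 1024 * 1024):
            calls, total = count_reads(path, block_size)
            print(f"block size {block_size:8d}: {calls} reads, {total} bytes")
    finally:
        os.unlink(path)
```

With equal block sizes both programs should need the same number of calls, so
a wall clock difference has to come from somewhere else.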

>       scanning for differences
> command:      instead of inner loop:
>       time tar df ~/t1.tar ; time star f ~/t1.tar --diff diffopts=data
> tar   real    0m1.097s        user    0m0.280s        sys     0m0.812s
> star  real    0m1.613s        user    0m0.360s        sys     0m1.224s
> tar   real    0m1.071s        user    0m0.288s        sys     0m0.784s
> star  real    0m1.639s        user    0m0.400s        sys     0m1.232s

When "diffing" the 3.3 GB archive from /usr against /usr, I get with ptime(1):

GNU tar:
real     6:42.457
user        3.591
sys        23.137

star:
real     5:53.672
user        3.114
sys        22.510
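
The relative difference between two ptime(1) results can be computed as shown
in this sketch. It assumes that in both listings in this mail the first timing
block is the GNU tar run and the second one the star run; the function names
are hypothetical.

```python
def ptime_to_seconds(stamp):
    """Convert a ptime(1)-style value ('M:SS.mmm' or 'S.mmm') to seconds."""
    seconds = 0.0
    for part in stamp.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def percent_less(slower, faster):
    """Return by how many percent `faster` is below `slower`."""
    a = ptime_to_seconds(slower)
    b = ptime_to_seconds(faster)
    return 100.0 * (a - b) / a


# Real (wall clock) times copied from the two ptime(1) listings in this mail.
# Assumption: first value is the GNU tar run, second value is the star run.
PAIRS = {
    "create": ("9:31.649", "6:03.418"),
    "diff": ("6:42.457", "5:53.672"),
}

if __name__ == "__main__":
    for name, (first, second) in PAIRS.items():
        print(f"{name}: second run needs "
              f"{percent_less(first, second):.1f}% less real time")
```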

> Conclusion:
> 1. On big archives star is from 2% to 5% faster.

When operating on files, star is typically 10%-30% faster than GNU tar.

> 2. When operating on files cached in memory, GNU tar is about half time 
> faster.

This seems to be impossible unless there is non-deterministic behavior in
your OS.

I see only 5%-10% deviations from call to call which is the expected standard 

> 3. Using STAR or GNU TAR is a question of habit, because both are comparable.

star implements a lot more features than GNU tar.

Note: If you plan to compare file extraction performance, you need to use
"star -no-fsync ..." to make star behave as unreliably as GNU tar does.

There are even more differences here in the behavior related to unlinking
files. If you want comparable results for the extract case, it is better to
call star as "tar" to switch it to the unlink behavior of a classical
UNIX tar.
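
Why the "-no-fsync" note above matters for extraction timings can be shown
with a rough sketch that writes many small files with and without an fsync(2)
per file. It only illustrates the cost of syncing and does not reproduce the
extraction code of either program; write_files and the file count and size are
made up for the example.

```python
import os
import tempfile
import time


def write_files(directory, count, size, sync):
    """Create `count` files of `size` bytes; return elapsed seconds.

    With sync=True each file is flushed to stable storage via fsync(2)
    before it is closed.
    """
    payload = b"x" * size
    start = time.perf_counter()
    for i in range(count):
        path = os.path.join(directory, f"f{i:05d}")
        # Binary file object so that short writes are handled for us.
        with open(path, "wb") as f:
            f.write(payload)
            if sync:
                f.flush()
                os.fsync(f.fileno())
    return time.perf_counter() - start


if __name__ == "__main__":
    for sync in (False, True):
        with tempfile.TemporaryDirectory() as d:
            elapsed = write_files(d, count=200, size=4096, sync=sync)
            print(f"fsync={sync}: {elapsed:.3f}s for 200 files")
```

How large the gap is depends heavily on the filesystem and the storage
device, so the numbers are only meaningful on the machine under test.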


 EMail:address@hidden (home) Jörg Schilling D-13353 Berlin
       address@hidden                (uni)  
       address@hidden     (work) Blog: http://schily.blogspot.com/
 URL:  http://cdrecord.berlios.de/old/private/ ftp://ftp.berlios.de/pub/schily
