
Re: [Bug-tar] GNU tar, star and BSD tar speed comparision +script


From: Joerg Schilling
Subject: Re: [Bug-tar] GNU tar, star and BSD tar speed comparision +script
Date: Tue, 23 Oct 2007 11:01:09 +0200
User-agent: nail 11.22 3/20/05

Tim Kientzle <address@hidden> wrote:

> > Either BSD tar is incredibly fast on tarball listing, or it took
> > extraordinarily long to issue an error message...
>
> When reading uncompressed tar archives stored in regular files,
> bsdtar uses lseek() operations to skip over the bodies of
> files.  Of course, when reading compressed archives or reading
> from tape, it can't use this optimization.  (The code for this
> was contributed by a user who regularly backs up his systems to

If you get the code for free and it works reliably, that is OK, but
if you have to work on it yourself, you need to think about priorities.

> very slow external USB disk drives; in that situation, this
> optimization makes an even bigger difference.)
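
The lseek() skipping described in the quoted text can be sketched roughly
as follows. This is a minimal illustration, not bsdtar's actual code: the
function name list_members is hypothetical, and the sketch assumes a plain
uncompressed "ustar" archive in a regular file. It ignores the ustar name
prefix field, GNU base-256 sizes and pax extended headers.

```python
import os

BLOCK = 512  # tar archives are a sequence of 512-byte blocks


def list_members(path):
    """Return (name, size) pairs, seeking past each member body."""
    members = []
    fd = os.open(path, os.O_RDONLY)
    try:
        while True:
            header = os.read(fd, BLOCK)
            # A short read or an all-zero block is treated as end of archive.
            if len(header) < BLOCK or header == bytes(BLOCK):
                break
            # The member name is a NUL-terminated field at offset 0.
            name = header[0:100].split(b"\0", 1)[0].decode("utf-8", "replace")
            # The size is 12 bytes of octal text at offset 124.
            size_field = header[124:136].strip(b"\0 ")
            size = int(size_field.decode("ascii"), 8) if size_field else 0
            members.append((name, size))
            # Bodies are padded to whole blocks; skip them without reading.
            padded = (size + BLOCK - 1) // BLOCK * BLOCK
            os.lseek(fd, padded, os.SEEK_CUR)
    finally:
        os.close(fd)
    return members
```

Listing then costs one 512-byte read per member instead of a read of the
whole archive, which is where the speedup on slow disks would come from.
On a pipe, a tape or a compressed stream the lseek() call is not usable,
and the body has to be read and discarded instead.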

If you have a fast, dual-buffered tar implementation like star, seek
"optimizations" are hard to implement in a 100% reliable way. Reliability
is more important than speed. This is why I experimented for about 6 years
before I made the FIFO in star the default in 1994.

On the other hand, uncompressed and seekable tar archives are rare.
For slow devices like USB-attached memory, compression may give you better
results in many cases. CPUs are fast these days...


There have been plans to implement seeks in star for at least 12 years, but
this is a low-priority feature. The features that have been added to star
in recent years (e.g. highly reliable incremental backups/restores
and a built-in find(1)) give much more benefit to the typical user.

Speaking of optimizations: if you create POSIX.1-2001 extended tar
archives (called "pax archives") or derived formats, the amount of user
CPU time increases by a factor of 3. Star has been optimized for
POSIX.1-2001 so that it consumes only the amount of user CPU time that
other tar implementations typically need for "ustar" type archives. Star
was one of the few tar implementations, if not the only one, that was not
affected by the recently published buffer overflow attack using
POSIX.1-2001 extended headers. This helps to convince people
to switch to the new format.
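
For readers unfamiliar with the format: a pax extended header body is a
series of records of the form "<length> <keyword>=<value>\n", where
<length> is the decimal byte count of the whole record, including the
length digits themselves. The sketch below shows the kind of length
checking such a parser needs. It is an illustration only, not star's code
and not a description of the published attack; the function name
parse_pax_records is hypothetical.

```python
def parse_pax_records(data):
    """Parse the body of a pax extended header (bytes) into a dict."""
    records = {}
    pos = 0
    while pos < len(data):
        space = data.find(b" ", pos)
        if space == -1:
            raise ValueError("missing length field")
        length_text = data[pos:space]
        if not length_text.isdigit():
            raise ValueError("length field is not decimal")
        end = pos + int(length_text.decode("ascii"))
        # The declared length must stay inside the header body and must
        # leave room for at least one byte after the space.
        if end > len(data) or end <= space + 1:
            raise ValueError("record length out of range")
        if data[end - 1:end] != b"\n":
            raise ValueError("record does not end in a newline")
        keyword, sep, value = data[space + 1:end - 1].partition(b"=")
        if not sep or not keyword:
            raise ValueError("record has no keyword=value pair")
        records[keyword.decode("utf-8")] = value.decode("utf-8", "replace")
        pos = end
    return records
```

A parser written in C that trusts the declared length, or copies the value
into a fixed-size buffer without a check like the one above, is the sort
of code in which an overflow could occur.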

BTW: as a result of the shared-memory, two-process architecture in star,
some ideas are hard to implement. If you ever think about archive feature
enhancements, take this into account and make sure to choose an approach
that is easy to implement in a program like star.

Jörg

-- 
 EMail:address@hidden (home) Jörg Schilling D-13353 Berlin
       address@hidden                (uni)  
       address@hidden     (work) Blog: http://schily.blogspot.com/
 URL:  http://cdrecord.berlios.de/old/private/ ftp://ftp.berlios.de/pub/schily



