Re: [Bug-tar] tar 1.23: Problem under Solaris 10 - incorrect GNU header


From: Tim Kientzle
Subject: Re: [Bug-tar] tar 1.23: Problem under Solaris 10 - incorrect GNU header contents
Date: Fri, 04 Jun 2010 13:28:42 -0700
User-agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.8.1.23) Gecko/20100314 SeaMonkey/1.1.18

Apologies, I should be a little more careful in my wording.

I have a bad habit of referring to the current default
format emitted by GNU tar as the "old GNU format", in order
to contrast it with the pax extended format that GNU has
been adding support for.

The GNU folks have something they call "old GNU format" which
is truly ancient and should not be used.

So please ignore the word "old" in my earlier descriptions
and accept my apologies for the confusion.

My warning about negative times applies to inconsistencies
between different programs.  GNU tar is consistent with
itself about this.

Hmmm...  Have you tried using GNU tar on your Windows
system to extract these files?  It's available through
the Cygwin system.

If you still have access to that old Solaris system, it
might well be worth trying to back up the data using a
couple of different programs---including native tar,
cpio, and zip---just to be on the safe side.

Tim


Armistead, Jason wrote:
Tim

Your information on the formats really helped.  It should be distributed with 
the GNU tar source.  After all, those are the specifications that it implements.

So, both GNU and "old" GNU formats will output the 'u' 's' 't' 'a' 'r' space 
space null sequence - correct ?

It's still not clear to me from the GNU tar source code how that sequence at offset 257 
is being generated for the GNU and "old" GNU formats (which is probably what 
added to my confusion).  Can you point me to the particular lines ?  If I had a debugger 
I would try stepping the code.  I'm trying to wind down this old legacy box, not do more 
with it .... sigh !

I did manage to figure out that some files that were being saved had modification 
times of 15 December 1942 & December 20 1942 (according to ls -la).  The 
mtime[] data is no longer valid octal.  I suspect this was part of what 7-Zip and 
WinZIP were unhappy about.  Your web pages warn about negative times.  It's a pity 
that GNU tar doesn't at least throw a warning message to stderr when it encounters 
problems like this.  It probably shouldn't encode them in an invalid way, but just 
store these out-of-range times as the beginning of the epoch.  Thoughts ?
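
A minimal Python sketch of how a reader could decode such an mtime[] field. It assumes the non-octal bytes are the base-256 binary encoding that GNU tar uses for values that cannot be written as octal digits (a leading 0x80 byte for positive values, 0xFF for negative ones). The thread does not confirm which encoding was actually written, and the function name is hypothetical.

```python
def decode_numeric_field(field: bytes) -> int:
    """Decode a tar numeric header field.

    Handles plain octal text, plus the GNU base-256 form (assumed here)
    in which the first byte is 0x80 (positive) or 0xFF (negative) and the
    remaining bytes hold a big-endian binary value.
    """
    if not field:
        raise ValueError("empty numeric field")
    marker = field[0]
    if marker in (0x80, 0xFF):
        # Binary form: the bytes after the marker are the magnitude.
        value = int.from_bytes(field[1:], "big")
        if marker == 0xFF:
            # Negative values are stored in two's complement.
            value -= 256 ** (len(field) - 1)
        return value
    # Octal form: ASCII digits padded with spaces and/or NUL bytes.
    text = field.strip(b" \x00").decode("ascii")
    return int(text, 8) if text else 0
```

A date in 1942 falls before the 1970 epoch, so its timestamp is negative. A reader that only accepts octal digits would reject such a field, while one that understands the binary form would recover the negative value.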

Regards
Jason

-----Original Message-----
From: Tim Kientzle [mailto:address@hidden]
Sent: Friday, 4 June 2010 1:12 PM
To: Armistead, Jason
Cc: Dustin J. Mitchell; address@hidden
Subject: Re: [Bug-tar] tar 1.23: Problem under Solaris 10 - incorrect GNU 
header contents

If you're looking for details about tar formats,
I wrote up a lengthy man page with a lot of
details about tar format variants.

There are online versions at the libarchive Wiki:

http://code.google.com/p/libarchive/wiki/ManPageTar5

and at the FreeBSD project man page reference:

http://www.freebsd.org/cgi/man.cgi?query=tar&sektion=5&manpath=FreeBSD+8.0-RELEASE&format=html

The mdoc-to-HTML translations seem to have some minor problems, though.
If you don't have access to a FreeBSD system, you might find the
mdoc source to be helpful:

http://code.google.com/p/libarchive/source/browse/trunk/libarchive/tar.5

In answer to your original question, the old "GNU tar" format
violates the POSIX ustar specification in several respects.
(GNU tar came out around the same time as the first POSIX
specification.)  Most obviously, it sets the 8 bytes
starting at offset 257 to:
   'u' 's' 't' 'a' 'r' space space null
where POSIX ustar archives set those same 8 bytes to:
   'u' 's' 't' 'a' 'r' null '0' '0'
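
To illustrate the difference, here is a minimal Python sketch that inspects those 8 bytes in a single 512-byte header block. The function and constant names are made up for this example and do not come from GNU tar or libarchive.

```python
MAGIC_OFFSET = 257
MAGIC_LENGTH = 8  # spans both the "magic" and "version" fields

GNU_MAGIC = b"ustar  \x00"    # 'u' 's' 't' 'a' 'r' space space null
POSIX_MAGIC = b"ustar\x0000"  # 'u' 's' 't' 'a' 'r' null '0' '0'


def classify_header(block: bytes) -> str:
    """Return 'gnu', 'posix-ustar', or 'unknown' for one header block."""
    if len(block) < MAGIC_OFFSET + MAGIC_LENGTH:
        raise ValueError("need at least the first 265 bytes of the header")
    magic = block[MAGIC_OFFSET:MAGIC_OFFSET + MAGIC_LENGTH]
    if magic == GNU_MAGIC:
        return "gnu"
    if magic == POSIX_MAGIC:
        return "posix-ustar"
    return "unknown"
```

Reading the first 512 bytes of an uncompressed archive and passing them to `classify_header` shows which of the two byte sequences the writer used.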

The GNU tar format also does not use the ustar
'prefix' field as specified in POSIX and has non-POSIX
extensions for handling long filenames, long linknames,
and sparse files.  The mechanism used for sparse
files, in particular, can cause tar implementations
that don't understand this extension to lose header
synchronization.

More recently, GNU tar has added support for the
"pax extended format" which is specified by current
POSIX standards.  You can request this format with
the --posix flag to current versions of GNU tar.
Despite the "pax" name, this is really an extended tar
format that has been broadly adopted.  It was also
carefully designed so that programs that understood
the old ustar format but do not recognize the pax
extensions would still be able to extract the files
contained in the archive (they would just not restore
any additional file metadata).
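
For anyone who wants to compare the formats without GNU tar at hand, the following sketch uses Python's standard `tarfile` module as a stand-in. This is an assumption of convenience: `tarfile` is a separate implementation, so its output is not guaranteed to match GNU tar byte for byte. The helper name `first_magic` is hypothetical.

```python
import io
import tarfile


def first_magic(fmt: int) -> bytes:
    """Write a one-file archive in memory and return the 8 bytes
    starting at offset 257 of its first header block."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=fmt) as archive:
        payload = b"hello\n"
        info = tarfile.TarInfo(name="hello.txt")
        info.size = len(payload)
        archive.addfile(info, io.BytesIO(payload))
    return buf.getvalue()[257:265]


for label, fmt in [("ustar", tarfile.USTAR_FORMAT),
                   ("gnu", tarfile.GNU_FORMAT),
                   ("pax", tarfile.PAX_FORMAT)]:
    print(label, first_magic(fmt))
```

With this module, the pax and ustar archives carry the POSIX byte sequence at offset 257, while the GNU archive carries the space-space-null variant.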

Hope this helps,

Tim


Armistead, Jason wrote:
Dustin wrote:

The entire original email was focused on ustar functionality, by my
read.  Perhaps you can repeat your experiment, bearing in mind that
you're expecting a GNU Tar archive, and let us know what happens?

My original experiment WAS with GNU formatted tar archives.  Some work, and 
some don't.  I have far larger tar files that are working OK.  But this one, 
from a very important filesystem, is not.  That is what led me to look more 
closely at the bytes in the file header records.

With regard to my e-mail, I made a newbie blunder (having never looked under the hood of 
tar before), and assumed that because the resulting files contained "ustar" in 
the header, they must have been in Ustar format.

If I'm correct in my understanding, a GNU formatted archive should also contain "ustar" 
(followed by a null) at offset 257 and "00" at offset 263.  Is this correct for GNU format 
archives ?

Also, 7-zip claims to support "TAR" format, but doesn't say which
format - are you sure it's designed to support GNU Tar archives?  If
you create a tar file with --format=ustar, can you read it with 7-zip?

7-Zip is decidedly vague on what sort of TAR it supports.  I now have the 
source code, but it still doesn't explain what TAR format(s) it supports. Time 
permitting, I'll try to instrument it to figure out where it's breaking, and to 
understand what format(s) it supports.  7-Zip's author didn't leave many 
comments in his code, and doesn't have the ability to conditionally add in 
debugging.  It could take me some time.

7-Zip will read many other TAR files. I have been able to download many of them from the Internet without problems.
My concern is, that for whatever reason, my Solaris 10 box with GNU tar 1.14 or 
1.23 produces what appear to be incorrect contents in the two fields I 
mentioned.

From what I've seen of 7-Zip's source, it isn't checking that the "ustar" and 
"00" fields are correct.  But, nevertheless, my installations of GNU tar are NOT 
producing the same binary output for these as other TAR files I get off the internet.  This 
troubles me, and makes me wonder what else is being screwed up.  I don't want to discover years 
from now that my old system is dead and buried, and the TAR files it produced are worthless ...   
Maybe the same bug is also causing other problems elsewhere.  I just can't be sure !

Regards,
Jason



