[Bug-tar] GNU tar generates malformed Pax attributes


From: Tim Kientzle
Subject: [Bug-tar] GNU tar generates malformed Pax attributes
Date: Sun, 8 Dec 2013 11:38:25 -0800

Pavel recently sent me an archive created with GNU tar
that includes SCHILY.xattr extensions.

bsdtar chokes on this because the Pax attributes are malformed.

Quoting from “IEEE Std 1003.1, 2013 Edition”
http://pubs.opengroup.org/onlinepubs/9699919799/utilities/pax.html

> An extended header shall consist of one or more records, each
> constructed as follows:
>
> "%d %s=%s\n", <length>, <keyword>, <value>
>
> The extended header records shall be encoded according to the
> ISO/IEC 10646-1:2000 standard UTF-8 encoding. The <length> field,
> <blank>, <equals-sign>, and <newline> shown shall be limited to the
> portable character set, as encoded in UTF-8.

Here’s a partial hexdump from the file in question.  The
offending record starts with “85” at the end of the first
line:

00000340  73 65 72 2e 74 65 73 74  33 3d 61 68 6f 0a 38 35  |ser.test3=aho.85|
00000350  20 53 43 48 49 4c 59 2e  78 61 74 74 72 2e 73 79  | SCHILY.xattr.sy|
00000360  73 74 65 6d 2e 70 6f 73  69 78 5f 61 63 6c 5f 61  |stem.posix_acl_a|
00000370  63 63 65 73 73 3d 02 00  00 00 01 00 06 00 ff ff  |ccess=..........|
00000380  ff ff 02 00 07 00 0f 00  00 00 04 00 06 00 ff ff  |................|
00000390  ff ff 10 00 07 00 ff ff  ff ff 20 00 04 00 ff ff  |.......... .....|
000003a0  ff ff 0a 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|

In particular, it appears that GNU tar is storing raw binary
here.  It is most definitely NOT valid UTF-8.
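
To make the problem concrete, here is a minimal Python sketch (not
libarchive's or GNU tar's actual code) that splits pax extended header
data into records following the "%d %s=%s\n" layout quoted above and
then checks whether a value decodes as UTF-8.  The function names
parse_pax_records and is_valid_utf8 are made up for illustration; the
sample bytes are the 85-byte record copied from the hexdump.

```python
def parse_pax_records(data: bytes):
    """Yield (keyword, value) byte pairs from pax extended header data.

    Each record is "<length> <keyword>=<value>\n", where <length> counts
    the whole record, including the length digits and the newline.
    """
    pos = 0
    while pos < len(data):
        space = data.index(b" ", pos)
        length = int(data[pos:space].decode("ascii"))
        record = data[pos:pos + length]
        if len(record) != length or not record.endswith(b"\n"):
            raise ValueError("truncated or malformed pax record")
        # Skip "<length> " at the front and the trailing newline.
        keyword, sep, value = record[space - pos + 1:-1].partition(b"=")
        if not sep:
            raise ValueError("pax record has no '=' separator")
        yield keyword, value
        pos += length


def is_valid_utf8(value: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        value.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# The 44-byte attribute value from the hexdump (offsets 0x376 to 0x3a1).
ACL_VALUE = bytes.fromhex(
    "02 00 00 00 01 00 06 00 ff ff ff ff "
    "02 00 07 00 0f 00 00 00 "
    "04 00 06 00 ff ff ff ff "
    "10 00 07 00 ff ff ff ff "
    "20 00 04 00 ff ff ff ff"
)
RECORD = b"85 SCHILY.xattr.system.posix_acl_access=" + ACL_VALUE + b"\n"

for keyword, value in parse_pax_records(RECORD):
    print(keyword.decode("ascii"), "valid UTF-8:", is_valid_utf8(value))
```

The record framing itself is fine (the length of 85 is correct); it is
only the value that fails the UTF-8 check, because a 0xff byte can
never appear in UTF-8.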

I suppose I’ll have to rework libarchive’s pax parser to
tolerate this.  It would be nice if GNU tar could avoid
such brokenness in the future.

Cheers,

Tim

P.S.  FWIW, LIBARCHIVE.xattr records don’t have this
problem:  they URL-encode the attribute name to ensure
the name is “limited to the portable character set” and
they base-64 encode the contents of the extended attribute.
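
As a rough illustration of that approach, the Python sketch below
URL-encodes an attribute name, base-64 encodes its value, and wraps the
result in a pax record.  It is an assumption-laden sketch, not
libarchive's implementation: the helper names are hypothetical, and the
exact escaping set and base-64 padding libarchive uses may differ from
what urllib.parse.quote and base64.b64encode produce here.

```python
import base64
from urllib.parse import quote


def make_pax_record(keyword: str, value: str) -> bytes:
    """Build "<length> <keyword>=<value>\n"; <length> counts itself too."""
    body = f" {keyword}={value}\n".encode("utf-8")
    n = len(body)
    length = n + len(str(n))
    # Adding the length digits can itself add a digit (e.g. 98 -> 101),
    # so repeat until the stated length matches the real record size.
    while n + len(str(length)) != length:
        length = n + len(str(length))
    return str(length).encode("ascii") + body


def encode_xattr_record(name: str, value: bytes) -> bytes:
    """Encode an extended attribute as a text-only pax record.

    Assumption: percent-encode everything outside the unreserved set in
    the name, and use standard padded base-64 for the value.
    """
    safe_name = quote(name, safe="")
    safe_value = base64.b64encode(value).decode("ascii")
    return make_pax_record("LIBARCHIVE.xattr." + safe_name, safe_value)


record = encode_xattr_record("user.my attr", b"\x02\x00\xff\xff")
print(record)
```

Because both the name and the value end up as plain ASCII, the whole
record is trivially valid UTF-8 no matter what bytes the attribute
holds.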



