Re: [Bug-tar] Multiple path headers mixing sparse and xattrs

From: Dominique Martinet
Subject: Re: [Bug-tar] Multiple path headers mixing sparse and xattrs
Date: Thu, 23 Jun 2016 15:35:51 +0200
User-agent: Mutt/1.5.23 (2014-03-12)


Dominique Martinet wrote on Thu, Jun 09, 2016 at 01:22:53PM +0200:
> (For anyone digging through the archives: this looks a lot like
> http://lists.gnu.org/archive/html/bug-tar/2010-11/msg00095.html, except
> that the file name must contain a UTF-8/non-ASCII component.)
> We've noticed that the extracted path for some files is wrong IF both
> --sparse and --xattrs are used AND the file is sparse and its path
> contains some "weird" (non-ASCII) characters.
> Here's a full reproducer, run on today's git master branch:
> $ cd $(mktemp -d)
> $ mkdir -p t
> $ dd if=/dev/urandom of=t/barbarbar bs=1M seek=50 count=1
> $ cp t/barbarbar t/mumuµmu
> $ tar --xattrs -S -c t | tar -t
> t/
> t/barbarbar
> t/GNUSparseFile.6221/mumuµmu
> I'm only listing here, but the file would be extracted under that path
> as well.
> Looking at the raw archive, the problem is that the path is recorded
> twice for mumuµmu:
> 30 GNU.sparse.name=t/mumuµmu
> ...
> 38 path=t/GNUSparseFile.6236/mumuµmu
> (while barbarbar only has GNU.sparse.name, and no path attribute)
> For now I've just applied a quick & dirty patch to my own copy of the
> path_decode function in src/xheader.c so that it keeps the first path,
> because it seems to work™ and we're in a bit of a hurry;
> another workaround, given in the mail I linked at the start, would be
> to use --sparse-version=0
> I guess the main fix should be to only output the name once though;
> looking at the code (src/create.c, write_header_name), it seems that we
> explicitly check !string_ascii_p (st->file_name) and write the extra
> header in that case.
> I'm not quite sure how to cleanly check that we have already written
> the file name in another attribute, though...
> (Thinking about it again, we might want to keep backward compatibility
> and handle archives made with existing tar versions rather than change
> the output we produce; so maybe always preferring GNU.sparse.name over
> path, without relying on order, would be a better solution?)

Does anyone have an opinion on this?
Would you take a patch if I went through the trouble of implementing
either solution?

I don't mind which solution I implement, and both look feasible (either
not writing the improper path record when creating the archive, or
ignoring path when GNU.sparse.name is set when extracting); but I'd
rather not pick one and be told "no, we prefer the other one" after not
getting any feedback... or just be plainly ignored.
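
To make the second option concrete, here is a rough sketch in Python (not
tar's actual C code) of what "prefer GNU.sparse.name regardless of order"
means on the extraction side. The function names are made up for
illustration, the sample data contains only the two records shown in the
dump quoted above (the elided ones are left out), and the "last record
wins" branch is only my approximation of the behaviour reported here, not
a description of what src/xheader.c really does.

```python
# Illustrative sketch only: hypothetical helper names, not GNU tar code.

# The two records from the dump above, as they would sit in the pax
# extended header body ("<len> <keyword>=<value>\n", len in bytes).
SAMPLE = (
    "30 GNU.sparse.name=t/mumuµmu\n"
    "38 path=t/GNUSparseFile.6236/mumuµmu\n"
).encode("utf-8")


def parse_pax_records(data):
    """Split a pax extended header body into (keyword, value) pairs.

    <len> is the decimal byte length of the whole record, including the
    length field itself, the separating space and the trailing newline.
    """
    records = []
    pos = 0
    while pos < len(data):
        space = data.index(b" ", pos)
        length = int(data[pos:space])
        record = data[space + 1 : pos + length - 1]  # drop the newline
        keyword, _, value = record.partition(b"=")
        records.append((keyword.decode("utf-8"), value.decode("utf-8")))
        pos += length
    return records


def member_name(records, prefer_sparse_name=True):
    """Pick the member name from a list of (keyword, value) records.

    prefer_sparse_name=False: whichever name-setting record comes last
    wins (assumed approximation of the reported behaviour).
    prefer_sparse_name=True: GNU.sparse.name wins whenever it is present,
    independent of where it appears relative to path.
    """
    name = None
    sparse_name = None
    for keyword, value in records:
        if keyword == "path":
            name = value
        elif keyword == "GNU.sparse.name":
            sparse_name = value
            name = value
    if prefer_sparse_name and sparse_name is not None:
        return sparse_name
    return name


records = parse_pax_records(SAMPLE)
print(member_name(records, prefer_sparse_name=False))
print(member_name(records, prefer_sparse_name=True))
```

With the sample above, the first call yields the GNUSparseFile placeholder
path and the second yields t/mumuµmu. The point of resolving it this way
is that archives already produced by existing tar versions would extract
correctly without changing what gets written.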

Thank you,
Dominique Martinet
