Re: [Bug-tar] Fwd: [regression] tar mess up \ and \\ files


From: Nathan Stratton Treadway
Subject: Re: [Bug-tar] Fwd: [regression] tar mess up \ and \\ files
Date: Sun, 23 Mar 2014 14:47:38 -0400
User-agent: Mutt/1.5.20 (2009-06-14)

On Sun, Mar 23, 2014 at 11:23:09 +0100, Niels Thykier wrote:
> On Sat, 22 Mar 2014 at 13:43:31 -0400, Nathan Stratton Treadway wrote:

> The tool building the relevant tar file (known as the "data.tar") would
> be dpkg-deb.  I am not entirely sure, but I think [1] reflects the tar
> command-line it uses, which would be:
> 
>   execlp(TAR, "tar", "-cf", "-", "--format=gnu", "--null", "-T", "-",
>      "--no-recursion", NULL);
> 
> I guess that is where the (missing) "-T" appears that caused us headaches.

Yes, exactly.

(This is originally invoked somewhere as "dpkg-deb --build [DIR]", where
DIR points to the directory tree being populated by the "echo foo"
commands, right?)

> > Are you sure there was a change in behavior of the test between tar
> > 1.27 and 1.27.1?
> 
> Yes. We experienced a behaviour change in 1.27 and we adapted the code
> and tests accordingly (e.g. [2] and [3]).  Now we are seeing it change
> again with 1.27.1.  For reference, tar 1.27 landed in Debian unstable
> 2013-10-15.

Right, given that -T is indeed involved, this makes sense.


> 
> For reference, the test in [2] does have 3 files created via (Makefile
> syntax):
>         echo foo > debian/tmp/usr/share/doc/filenames/bokmål
>         echo foo > debian/tmp/usr/share/doc/filenames/bokm\\ål
>         echo foo > debian/tmp/usr/share/doc/filenames/bokm\\\\ål
> 

(If anyone else wants to find this, those lines are at:
   http://anonscm.debian.org/gitweb/?p=lintian/lintian.git;a=blob;f=t/tests/legacy-filenames/debian/debian/rules#l95
)

 
> Accordingly, when I saw the test output change and compared it to what
> the test was doing, I thought the test had been wrong all this time and
> tar finally fixed it. Namely, I would expect the above to create the
> following 3 files on the file-system
> 
>   O1 bokmål    (6 chars long)
>   O2 bokm\ål   (7 chars long)
>   O3 bokm\\ål  (8 chars long)

Right.
 
> If I understand the situation correctly, then these 3 files are passed
> to tar (with -T --null) via dpkg, causing them to be "unquoted".  This
> reduces the number of unique names to (per [4]):
> 
>   T1  bokmål     (6 chars long, T1 is O1 in the tarball)
>   T2  bokmål     (6 chars long, should have been O2, but in tar it is
>                   a hardlink of T1)
>   T3  bokm\ål    (7 chars long, T3 is O3 in the tarball)

Yes, exactly.
 
> T1, T2 and T3 are (as I understand you and other people on this mailing
> list) the correct, expected behaviour when the tarball is built as:
> 
>    tar -cf - --format=gnu --null -T - --no-recursion
> 

Well, I guess the point is that it _is_ the expected behavior of that
tar command, but it does not produce the "correct" result in this case.

That is, assuming that the "echo" commands in debian/rules really do
produce the three unique files on the filesystem as expected, the "true"
failure here is that dpkg-deb is generating a data.tar file that does
not accurately mirror the contents of the debian/tmp/... directory
tree.

By "luck" tar 1.27 generated an actually-correct "data.tar" archive,
thus tipping you off that your test cases were originally written to
expect the incorrect archive contents....

However, as Sergey explained earlier in this thread, the tar 1.27
unquoting behavior wasn't intended, and it was changed back to match
long-term behavior in 1.27.1.

> and if I want the paths named O1, O2 and O3 to appear in the tarball
> (rather than T1, T2 and T3), I need to either pass --no-unquote (not an
> option here) or add one more level of quoting?


So, I think that means that dpkg-deb has been (and is again) liable to
generate incorrect data.tar archives for packages that have unusual
filenames in them.  (It may be that this doesn't actually happen much
outside lintian test cases, but the point is that if any actual source
package did try to create such files, the resulting .deb wouldn't
contain the expected filenames...)

I don't know much about dpkg-deb's internal workings, but off hand I'd
say that it needs to either pass the --no-unquote option when spawning
tar or to quote the path names that it passes to the tar subprocess...
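
To make those two options concrete, here is a rough sketch in C.  It
is not taken from dpkg's source: the function name
quote_for_tar_files_from and the array name tar_argv_no_unquote are
made up for illustration.  It assumes only that tar's default
unquoting of -T input turns "\\" back into a single "\", and that
--no-unquote switches that unquoting off.  It deals with backslashes
only, not with any other special handling tar may apply to names read
from a file list.

```c
#include <stdlib.h>
#include <string.h>

/* Option 1 (hypothetical helper, not from dpkg): return a malloc'd copy
 * of NAME with every backslash doubled, so that tar's default unquoting
 * of names read via -T restores the original name.  Returns NULL on
 * allocation failure; the caller must free() the result. */
char *
quote_for_tar_files_from(const char *name)
{
    size_t len = strlen(name);
    char *out = malloc(2 * len + 1);   /* worst case: all backslashes */
    char *p = out;

    if (out == NULL)
        return NULL;

    for (; *name != '\0'; name++) {
        if (*name == '\\')
            *p++ = '\\';               /* emit the extra backslash */
        *p++ = *name;
    }
    *p = '\0';
    return out;
}

/* Option 2: the argument list quoted earlier in this thread, with
 * --no-unquote added ahead of -T so that the names are taken literally.
 * Whether dpkg-deb can rely on that option being available is a
 * separate question. */
const char *const tar_argv_no_unquote[] = {
    "tar", "-cf", "-", "--format=gnu", "--null", "--no-unquote",
    "-T", "-", "--no-recursion", NULL
};
```

With option 1, each name would be passed through the helper just
before being written (NUL-terminated) to tar's standard input; with
option 2 the names would be written unchanged.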

(If in fact the problem does occur during the "dpkg-deb --build" step, I
don't think there's any quoting that you can add yourself to correct the
problem...)

                                                Nathan 


----------------------------------------------------------------------------
Nathan Stratton Treadway  -  address@hidden  -  Mid-Atlantic region
Ray Ontko & Co.  -  Software consulting services  -   http://www.ontko.com/
 GPG Key: http://www.ontko.com/~nathanst/gpg_key.txt   ID: 1023D/ECFB6239
 Key fingerprint = 6AD8 485E 20B9 5C71 231C  0C32 15F3 ADCD ECFB 6239


