From: Neil Williams
Subject: Re: non-deterministic compression for CREDITS.gz in libppl9 amd64 & kfreebsd-amd64
Date: Tue, 7 Feb 2012 00:02:46 +0000

On Mon, 06 Feb 2012 14:21:15 -0800
Paul Eggert <address@hidden> wrote:

> I can't reproduce the problem on x86-64 with vanilla
> gzip 1.4 and vanilla gzip 1.3.12.  So the problem appears to be
> either architecture-dependent, or it's a property of
> the Debian patches to gzip, or something like that, and
> I expect we'll need more information about how to
> reproduce the problem.  It looks like the problem is with
> 1.3.12-9 on armel so you might want to focus your attention
> there.

The broken CREDITS.gz was created with gzip 1.4 from Debian unstable. I
happened to use 1.3.12 to test on a different armel machine, but the
whole problem with this bug is that it is non-deterministic: simply
repeating the compression can "fix" the apparent problem.

I added the extra information because both versions of CREDITS.gz are
available via the packages specified. Rather than having to rely on my
own debug information, anyone can view and analyse the actual .gz files
involved, validate the checksums, and check the build logs to confirm
which version of gzip was installed:

gzip: already installed (1.4-1)

For comparison, the i386 build used the same version of gzip on the
same file and gave a different .gz file:
99e2b9f8972ce00cfe57e3735881015e  usr/share/doc/libppl9/CREDITS.gz
0e52e84eebf41588865742edaff7b3c0  usr/share/doc/libppl9/CREDITS.gz

i386 log:

More examples may well turn up soon as more people install the
MultiArch-aware version of dpkg, which allows packages for different
architectures to be installed alongside each other. This assumes and
requires that a file compressed on one architecture is byte-identical
to the same file compressed on a different architecture. It is quite
possible that the bug in gzip is independent of the architecture
itself, but that is how all of these issues are going to show up.

Indeed, a quick check shows that this is not architecture-specific. The
kfreebsd-amd64 log shows that CREDITS.gz is a larger file than


6344 2011-02-27 09:07 ./usr/share/doc/libppl9/CREDITS.gz

6343 2011-02-27 09:07 ./usr/share/doc/libppl9/CREDITS.gz


0e52e84eebf41588865742edaff7b3c0  usr/share/doc/libppl9/CREDITS.gz

Same as armel but different to armhf, i386 and amd64.
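
To separate "the container differs" from "the contents differ", it can
help to hash both the .gz file and its decompressed payload. The
following is a minimal sketch using only the Python standard library;
the function names are illustrative and not part of any existing tool.

```python
import gzip
import hashlib


def describe_gz(blob):
    """Summarise one gzip file: size, whole-file MD5 and payload MD5."""
    return {
        "size": len(blob),
        # Same digest that md5sum would print for the .gz file itself.
        "md5": hashlib.md5(blob).hexdigest(),
        # Digest of the decompressed contents.
        "payload_md5": hashlib.md5(gzip.decompress(blob)).hexdigest(),
    }


def same_payload_different_container(a, b):
    """True when two .gz blobs differ byte-for-byte yet decompress identically."""
    da, db = describe_gz(a), describe_gz(b)
    return da["md5"] != db["md5"] and da["payload_md5"] == db["payload_md5"]
```

Reading the two CREDITS.gz files with open(path, "rb").read() and
passing them to same_payload_different_container() would confirm
whether only the compressed representation differs.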

I see no reason why a change of kernel or of gcc compiler flags for the
same version of gzip (all 1.4) would cause such non-deterministic
results from using gzip -9n.

There is something else going on here, something internal to gzip which
is changing certain bytes inside the compressed file, and in the same
manner each time. It is strange indeed for four separate machines to
produce two matching pairs of the same discrepancy when running the
same code.
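
One way to see which bytes change is to compare the two files position
by position. This is only a sketch with hypothetical helper names. Note
that when the sizes differ by one byte, as with 6344 versus 6343 here,
everything after the first divergence is likely to be reported as
different, so the first offset is the interesting one.

```python
def differing_offsets(a, b):
    """Return the offsets, within the common length, where a and b differ."""
    limit = min(len(a), len(b))
    return [i for i in range(limit) if a[i] != b[i]]


def summarise_diff(a, b):
    """Condense a byte-level comparison into a few numbers."""
    offsets = differing_offsets(a, b)
    return {
        "first": offsets[0] if offsets else None,  # where the streams diverge
        "count": len(offsets),                     # differing positions
        "length_delta": len(a) - len(b),           # e.g. 6344 - 6343 = 1
    }
```

An offset below 10 would point at the gzip header (timestamp, flags, OS
byte) rather than at the deflate stream itself.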

Ignore my tests with an older version of gzip; these results are all
with gzip 1.4-1. Whether I decompress and recompress on amd64 or on
armel, the discrepancy goes away. The problem is that we cannot
anticipate when the discrepancy will occur, leading to packages failing
to install in random and unpredictable patterns.

This bug is going to be hard to reproduce, but its results are neither
architecture-dependent nor version-dependent. Interestingly, other text
files in the same package, compressed on the same machine using the
same options to gzip, do not differ. It is the peculiar requirements of
MultiArch which have brought this to light. In the majority of cases
the results of gzip -9n on the same file are identical, but not always.
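
Since the discrepancy only shows up occasionally, a reproduction attempt
probably needs to repeat the compression many times and count the
distinct results. The sketch below assumes a gzip binary on PATH and
pipes the data through "gzip -9n -c". Python's own gzip module is
deliberately not used as the compressor, because it is built on zlib
rather than on gzip's deflate code and so would not exercise the same
code path. The function names and the default run count are
illustrative assumptions.

```python
import hashlib
import subprocess
from collections import Counter


def gnu_gzip_9n(data):
    """Compress data with the system gzip binary (assumed to be on PATH)."""
    result = subprocess.run(
        ["gzip", "-9n", "-c"],  # best compression, no name/timestamp, stdout
        input=data,
        stdout=subprocess.PIPE,
        check=True,
    )
    return result.stdout


def digest_histogram(data, compress=gnu_gzip_9n, runs=100):
    """Compress the same data repeatedly and count each distinct MD5."""
    return Counter(hashlib.md5(compress(data)).hexdigest() for _ in range(runs))
```

A histogram with more than one key for the same input would be direct
evidence of non-deterministic output on that machine; a single key
after many runs does not prove the absence of the bug.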


Neil Williams
