bug#49952: new snapshot available: gzip-1.10.34-aa73
From: Adler, Mark
Subject: bug#49952: new snapshot available: gzip-1.10.34-aa73
Date: Tue, 10 Aug 2021 16:09:06 +0000

That seems like too extreme a behavior change. If this proposal is implemented,
then 99.9% of the time, pigz -l will decode the entire file. If someone is
regularly doing pigz -l on a large number of files, the time it would take
would go up by orders of magnitude.

The fundamental dilemma is that gzip -l and pigz -l give the correct answer
nearly all of the time, and are extremely fast. So "-l" still seems useful.

> On Aug 9, 2021, at 2:41 PM, Paul Eggert <eggert@CS.UCLA.EDU> wrote:
>
> On 8/9/21 8:19 AM, Adler, Mark wrote:
>> pigz -l doesn’t do that, but pigz -lt does. Since -t has to decode the whole
>> file, -l combined with it will use that information to give the correct
>> result. For compatibility, pigz -l still does what gzip does, which is to
>> guess based on what it finds at the end of the file.
>
> Perhaps gzip -l could do bounded work, as follows:
>
> * Look at the trailer to see what its byte count B says.
>
> * Decompress until it sees more than B bytes.
>
> * If so, report that the -l sizes are bogus. If not, carry on as before.
>
> Due to format limits, B can be at most 2**32 - 1, so this provides a bound on
> the amount of work, a bound that's reasonably small nowadays.
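
A minimal sketch of the bounded check proposed in the quoted text, in Python
rather than in gzip's or pigz's own C code. It assumes the whole file fits in
memory and that nothing but gzip members follows the first member (no
trailing padding). The names trailer_isize and listed_size_is_bogus are
hypothetical, not part of gzip or pigz. B is read from the last four bytes of
the file, which is where the gzip format stores the 32-bit size.

```python
import gzip
import struct
import zlib


def trailer_isize(data: bytes) -> int:
    """Return the size stored in the last 4 bytes (little-endian, 32-bit)."""
    return struct.unpack("<I", data[-4:])[0]


def listed_size_is_bogus(data: bytes, chunk: int = 1 << 16) -> bool:
    """Decompress until more than B bytes appear; True means -l would be wrong."""
    claimed = trailer_isize(data)            # B in the proposal
    total = 0
    rest = data
    while rest:                              # one pass per gzip member
        d = zlib.decompressobj(wbits=31)     # wbits=31 selects the gzip wrapper
        buf = rest
        while not d.eof:
            out = d.decompress(buf, chunk)   # at most `chunk` bytes of output
            total += len(out)
            if total > claimed:
                return True                  # stop early: the work is bounded by B
            buf = d.unconsumed_tail
            if not d.eof and not buf and not out:
                raise ValueError("truncated gzip stream")
        rest = d.unused_data                 # bytes after this member, if any
    return False                             # size is plausible; carry on as before


# Two concatenated members: the trailer only describes the last one (10 bytes).
multi = gzip.compress(b"a" * 1000) + gzip.compress(b"b" * 10)
single = gzip.compress(b"hello")
print(trailer_isize(multi), listed_size_is_bogus(multi))
print(trailer_isize(single), listed_size_is_bogus(single))
```

Note that in the common case, where the stored size is correct, the loop never
exceeds B and therefore runs to the end of the file. That is the cost raised
in the reply above.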