bug-gzip


Re: [PATCH] Unzip fails without optional extended local header signature


From: Michael Gray
Subject: Re: [PATCH] Unzip fails without optional extended local header signature
Date: Mon, 30 Jan 2012 13:18:01 -0800

On Wed, Nov 30, 2011 at 1:57 AM, Jim Meyering <address@hidden> wrote:
> Michael Gray wrote:
>> I believe I've found a bug regarding the decompression of single-entry
>> .zip files.
>>
>> As per Section V.C of the .ZIP File Format Specification
>> (http://www.pkware.com/documents/casestudies/APPNOTE.TXT), data
>> descriptors (called the extended local header in the gzip source) *may
>> or may not* be preceded by a signature. Gzip always assumes this
>> signature is present; if it is not, it reads the CRC and length values
>> 4 bytes further into the file than it should, and the CRC and length
>> checks fail even though the file is not corrupt.
>>
>> I've included a patch that works around the problem. First, it assumes
>> that the signature *is not present*, as it's possible that the
>> signature value is also a valid CRC, and no non-corrupt file should be
>> rejected if its CRC just happens to match the signature value. If the
>> CRC or length check fails assuming the signature is not present, the
>> signature is then checked for. If present, 4 more bytes of input are
>> read, and the previously read values are shifted appropriately. The
>> CRC and length checks then proceed as normal.
>>
>> Below is the relevant text from the .ZIP spec:
>> "
>>
>>       Although not originally assigned a signature, the value
>>       0x08074b50 has commonly been adopted as a signature value
>>       for the data descriptor record.  Implementers should be
>>       aware that ZIP files may be encountered with or without this
>>       signature marking data descriptors and should account for
>>       either case when reading ZIP files to ensure compatibility.
>>       When writing ZIP files, it is recommended to include the
>>       signature value marking the data descriptor record.  When
>>       the signature is used, the fields currently defined for
>>       the data descriptor record will immediately follow the
>>       signature.
>>
>> "
>>
>> -- Michael Gray
>>
>> address@hidden
>
> Thank you for the analysis and patch.
> However, I haven't looked at it yet, in case I have to
> rewrite it based solely on your description.
>
> Can you point to tools that produce ZIP files without that signature?
>
> Assuming that we find a few that are still in non-trivial use,
> we'll need a copyright assignment, since your patch is large enough to
> require that.  Can you sign one?  If so, here are some details:
> [that link is for the coreutils package, but it's the same policy for gzip]
>
>    http://git.sv.gnu.org/cgit/coreutils.git/tree/HACKING#n444
>
> Note: if you're in the US, you should be able to fax the signed form
> rather than using actual stamp and envelope.
>
> Jim

Apparently my emails to the copyright clerk ended up in the wrong
queue and were delayed for two months. I have now received the copyright
assignment form, signed it, and submitted it just now.
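
For anyone who has to work from the description alone, the check order
described in the quoted message can be sketched roughly as below. This
is an illustrative sketch only, not the submitted patch and not gzip's
actual code: the names le32, DD_SIG and check_data_descriptor are
hypothetical, and it assumes the descriptor bytes are already in a
buffer and that the expected CRC and uncompressed length are known.

```c
#include <stddef.h>
#include <stdint.h>

/* Optional data descriptor signature quoted from the .ZIP spec. */
#define DD_SIG UINT32_C(0x08074b50)

/* Read a 32-bit little-endian value (hypothetical helper). */
static uint32_t le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Return 1 if the data descriptor at buf agrees with the computed CRC
   and uncompressed length, 0 otherwise.  Layout without signature:
   crc(4) csize(4) usize(4); with signature: sig(4) crc(4) csize(4)
   usize(4). */
static int check_data_descriptor(const unsigned char *buf, size_t avail,
                                 uint32_t crc, uint32_t ulen)
{
    if (avail < 12)
        return 0;

    /* First assume no signature, so that a CRC that happens to equal
       the signature value is still accepted. */
    if (le32(buf) == crc && le32(buf + 8) == ulen)
        return 1;

    /* Only on failure look for the signature and re-read the fields
       4 bytes further in. */
    if (le32(buf) == DD_SIG && avail >= 16)
        return le32(buf + 4) == crc && le32(buf + 12) == ulen;

    return 0;
}
```

Trying the unsigned layout first is the point of the ordering: a file
whose CRC equals 0x08074b50 and that carries no signature still passes.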

-- Michael


