bug-coreutils

Re: MD5SUM: False Negative


From: Eric Blake
Subject: Re: MD5SUM: False Negative
Date: Wed, 25 Oct 2006 06:14:58 -0600
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.7) Gecko/20060909 Thunderbird/1.5.0.7 Mnenhy/0.7.4.666

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Please keep replies on the list, so that others may see the conclusion of
this thread.

According to Gary Bartlett on 10/23/2006 10:42 AM:
> I ran across a posting that lists the same issue, if this helps:
> 
> http://sourceware.org/ml/cygwin/2001-06/msg00310.html

That post is quite old.  It predates when Paul made upstream changes in
text vs. binary handling in md5sum for coreutils 5.90.  So it is not
really relevant to this discussion.

> *From:* Gary Bartlett
> *Sent:* Monday, October 23, 2006 9:33 AM
> 
> Hi Eric,
> 
> Thanks for responding.  I tried running my version of MD5SUM with a
> --binary option, but this resulted in a completely different hash. 
> Also, running with --binary or --text isn't supported when running with
> the --check option.
> 

The --check option decides text vs. binary based on the presence or
absence of * before the filename, which in turn is determined by the mode
that the original file was read in when computing the checksum.  However,
I am very disappointed that * was chosen to mean binary, rather than text.
Consider a sum generated on Linux, where both the sum file and the data
file are then copied to Windows in binary mode.  Checking that sum on
Windows will read the file in text mode and fail, even though the sum was
computed in binary mode, all because * is omitted on platforms that don't
distinguish between text and binary.  I would much rather see * mark the
exception to the rule, used only when computing a sum in text mode, so
that sum files become interoperable between machines when using the
default behavior of summing every byte in the respective files.
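
To make the failure mode concrete, here is a minimal Python sketch of the
check logic described above.  It is not the coreutils implementation: the
helper names are made up for illustration, and text mode is modelled
simply as CRLF collapsing to LF, which is only an approximation of what a
Windows text-mode read does.

```python
import hashlib
import re

# One sum-file line: 32 hex digits, a space, then ' ' (text) or '*'
# (binary), then the file name.
LINE_RE = re.compile(r"^([0-9a-fA-F]{32}) ([ *])(.*)$")


def parse_sum_line(line):
    """Split a sum-file line into (digest, is_binary, name)."""
    m = LINE_RE.match(line.rstrip("\n"))
    if m is None:
        raise ValueError("improperly formatted checksum line")
    digest, flag, name = m.groups()
    return digest.lower(), flag == "*", name


def md5_binary_mode(data):
    """Sum every byte exactly as stored."""
    return hashlib.md5(data).hexdigest()


def md5_text_mode(data):
    """Assumed model of a text-mode read: CRLF is seen as LF."""
    return hashlib.md5(data.replace(b"\r\n", b"\n")).hexdigest()


def check(line, data):
    """Pick the read mode from the '*' flag, then compare digests."""
    digest, is_binary, _name = parse_sum_line(line)
    actual = md5_binary_mode(data) if is_binary else md5_text_mode(data)
    return actual == digest


if __name__ == "__main__":
    data = b"one\r\ntwo\r\n"
    # Linux sums every byte but writes no '*' in front of the name.
    from_linux = md5_binary_mode(data) + "  file.txt"
    with_star = md5_binary_mode(data) + " *file.txt"
    print(check(from_linux, data))  # read as text on the checking side
    print(check(with_star, data))   # read as binary on the checking side
```

With this model, the line written without * does not verify on a
platform that honors text mode, while the same digest written with *
does.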

I am also hoping to get around to writing a patch to md5sum to encode \r
the same way \n is encoded, so that a \r at the end of a line in the sum
file is no longer ambiguous: today it could be a literal \r in the file
name, or a line ending corrupted when the file was copied across machines.
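
As a rough illustration of that idea, the Python sketch below mimics the
existing output convention as I understand it (a leading backslash on the
line, with \\ and \n escapes in the name) and adds \r escaping behind a
flag.  The function names and the escape_cr flag are hypothetical; this
is a sketch of the proposal, not a patch against md5sum.

```python
def escape_name(name, escape_cr=False):
    """Return (line_prefix, escaped_name) for a sum-file line.

    Backslash and newline are escaped as \\\\ and \\n.  With escape_cr
    set (the proposed extension), carriage return is escaped as \\r too.
    """
    needs_escape = "\\" in name or "\n" in name
    if escape_cr and "\r" in name:
        needs_escape = True
    if not needs_escape:
        return "", name
    # Escape backslashes first so later escapes are not doubled.
    escaped = name.replace("\\", "\\\\").replace("\n", "\\n")
    if escape_cr:
        escaped = escaped.replace("\r", "\\r")
    return "\\", escaped


def format_line(digest, name, binary=False, escape_cr=False):
    """Build one sum-file line; '*' marks a binary-mode sum."""
    prefix, escaped = escape_name(name, escape_cr)
    flag = "*" if binary else " "
    return f"{prefix}{digest} {flag}{escaped}"


if __name__ == "__main__":
    digest = "0" * 32  # placeholder digest, not a real sum
    # repr() makes any raw \r left in the line visible.
    print(repr(format_line(digest, "odd\rname")))
    print(repr(format_line(digest, "odd\rname", escape_cr=True)))
```

With the flag set, no raw \r is ever written as part of a file name, so
any \r found at the end of a line can be attributed to transfer damage.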

> It's strange, though, that the hash produced (by default, or with
> --binary) is the same hash, just with different case.  Do you think this
> is irrelevant for the --check option?

Different case is a misfeature of Windows, with its case-insensitive,
case-preserving filesystem.  But it should not matter to the check option.

> 
> My version of MD5SUM is the latest available from CygWin.  I will try to
> find another distribution that has a newer COREUTILS and will try that.

Cygwin now provides pre-built coreutils 6.4 (I uploaded it last night).
When dealing with text vs. binary issues, perhaps a better list to ask is
the cygwin list, rather than here, since there are more experts on that
list when it comes to line ending issues.  It so happens that I am on both
lists, since I maintain the cygwin port of coreutils.

> 
> C:\>md5sum --version
> md5sum (GNU coreutils) 5.3.0

5.3.0 is old; it predates some of the upstream changes in 5.90.

- --
Life is short - so eat dessert first!

Eric Blake             address@hidden
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.1 (Cygwin)
Comment: Public key at home.comcast.net/~ericblake/eblake.gpg
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFP1VC84KuGfSFAYARAkmMAJ9vCnA85w6nAfaplF7LbA6KY9722QCdFCWd
9FNt2GTGXSf7yuH89JjKyMA=
=Np0W
-----END PGP SIGNATURE-----



