bug-coreutils

Re: Addendum: Possible Bug in comm ?


From: Joseph A. Wiencko, Jr.
Subject: Re: Addendum: Possible Bug in comm ?
Date: Wed, 13 Sep 2006 23:57:50 -0400
User-agent: Mozilla Thunderbird 1.0.7 (Windows/20050923)

Hello Eric Blake,

Thank you for the quick response!

I do include the LC_ALL statement in my shell scripts (although not often when I'm using the command line, where it comes up sometimes).

But regardless of the collating sequence of ANY non-POSIX locale, why would it EVER be normal behavior for comm to say, in the example I posed, that the underscore character is in the first file but not the second (comm -23), simultaneously say that it is in the second but not the first (comm -13), and simultaneously say that it is not in both files (comm -12)? This appears to be a logical bug in comm, regardless of the collating sequence.

Another way of saying it is: "there exists no collating sequence in which this should be the behavior of comm -- it is contrary to the basic, stated behavior of comm REGARDLESS OF THE COLLATING SEQUENCE".

Another way of saying it, in a question format, is "Is there ANY collating sequence in which the example I posed makes sense for the behavior of comm? If so, please name one."

This logical error, regardless of collating sequence, is why it appears to be a bug.
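
The consistency property described above can be written down as a small check. This is only a sketch: the sample lines and file names are hypothetical (they are not the files from the original report), mktemp -d is assumed to be available, and the inputs are assumed to be sorted in the same collating sequence that comm itself uses.

```shell
#!/bin/sh
# Sketch: no line should be reported as unique to BOTH files at once.
# Assumption: the inputs are sorted in the locale comm runs under (C here).
LC_ALL=C
export LC_ALL

tmpdir=$(mktemp -d)

# Hypothetical sample data, already in C-locale (byte) order: B < _ < a < c.
printf '%s\n' 'B' '_' 'a' > "$tmpdir/first"
printf '%s\n' 'B' 'a' 'c' > "$tmpdir/second"

# Lines only in the first file, and lines only in the second file.
comm -23 "$tmpdir/first" "$tmpdir/second" > "$tmpdir/only_first"
comm -13 "$tmpdir/first" "$tmpdir/second" > "$tmpdir/only_second"

# A line present in both of those outputs would be a contradiction.
contradictions=$(comm -12 "$tmpdir/only_first" "$tmpdir/only_second")

if [ -z "$contradictions" ]; then
    echo "consistent: no line is unique to both files"
else
    echo "inconsistent lines:"
    printf '%s\n' "$contradictions"
fi

rm -rf "$tmpdir"
```

If the check reports inconsistent lines, one thing worth ruling out is that the files were sorted under a different locale than the one comm was run in.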

Thanks.

-- Joseph A. Wiencko, Jr.
address@hidden

Eric Blake wrote:

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

According to Joseph A. Wiencko, Jr. on 9/13/2006 9:05 PM:
> Hello again,
>
> The expected behavior (instead of the anomalous behavior) occurs in a
> bash script when the following statement is included:
>
> export LC_ALL=POSIX
>
> But why would the other behavior ever be desired, even if LC_ALL=POSIX
> is not specified?

Because non-POSIX locales have different collating sequences, and POSIX
requires comm (and other utilities) to respect locales.  This has its
benefits when you know to expect it, but any seasoned shell programmer
will tell you that LC_ALL=POSIX (or shorter, LC_ALL=C) is one of the
things they do at startup to sanitize the shell script.
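
A minimal sketch of that startup step, assuming hypothetical file names and sample lines (not the data from the original report) and assuming mktemp -d is available. The point it illustrates is that sort and comm both run under the same pinned locale, so they agree on one collating sequence.

```shell
#!/bin/sh
# Sketch: pin the locale once at the top of the script so that every
# utility below (sort, comm) uses the same collating sequence.
LC_ALL=C
export LC_ALL

tmpdir=$(mktemp -d)

# Hypothetical unsorted input files.
printf '%s\n' '_' 'a' 'B' > "$tmpdir/first.txt"
printf '%s\n' 'a' 'B' 'c' > "$tmpdir/second.txt"

# comm expects sorted input, so sort under the same locale first.
sort "$tmpdir/first.txt"  > "$tmpdir/first.sorted"
sort "$tmpdir/second.txt" > "$tmpdir/second.sorted"

only_first=$(comm -23 "$tmpdir/first.sorted" "$tmpdir/second.sorted")
only_second=$(comm -13 "$tmpdir/first.sorted" "$tmpdir/second.sorted")
in_both=$(comm -12 "$tmpdir/first.sorted" "$tmpdir/second.sorted")

printf 'only in first:  %s\n' "$only_first"
printf 'only in second: %s\n' "$only_second"
printf 'in both:\n%s\n' "$in_both"

rm -rf "$tmpdir"
```

With these sample lines the underscore should be reported only by comm -23, because in the C locale the sorted order is by byte value (B, then _, then a, then c).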

> comm (coreutils) 5.2.1
> Linux 2.6.9-22.0.1.ELsmp #1 SMP Thu Oct 27 14:49:37 CDT 2005 x86_64
> x86_64 x86_64 GNU/Linux
> Report bugs to <address@hidden>

You are probably due for an upgrade.  The latest stable version of
coreutils is 5.97, and beta 6.1 is also available.

- --
Life is short - so eat dessert first!

Eric Blake             address@hidden
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.1 (Cygwin)
Comment: Public key at home.comcast.net/~ericblake/eblake.gpg
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFCM1J84KuGfSFAYARAhqhAJ9W9N+FQj1E2yaatp3zS/bTMR1LXACdHiAt
dTmYNc3UeaW89/+S8PaDoQs=
=6IG1
-----END PGP SIGNATURE-----




