bug-coreutils

Re: Sort order bug in GNU sort


From: Eric Blake
Subject: Re: Sort order bug in GNU sort
Date: Thu, 29 Oct 2009 18:51:02 -0600

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

[please don't top-post on technical lists]

According to Luke Hutchison on 10/29/2009 6:43 PM:
> Hi Pádraig,
> As stated, "The following is the output of GNU sort (without any
> switches)" -- i.e. I used the defaults, and did not specify any
> commandline switches.  If as you say, by default the whole line is the
> sort key, and if default sorting is lexicographic order, how are the
> following snippets from the sorted output possibly correct?
> 
> sampleId-1010,0.0625
> sampleId-101,0.0625
> sampleId-1010,1.0

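Well, that looks correct to me, if your current locale specifies that
punctuation is ignored during collation.  With '-', ',' and '.' ignored,
the three lines compare as sampleId101000625, sampleId10100625 and
sampleId101010, so at the first differing digit you are getting:
101000 < 101006 < 101010.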
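
As a rough sketch of that comparison: decorating each line with a
punctuation-free key and sorting on the key byte-wise reproduces the
order reported.  This is an approximation only.  Real locale collation
uses multi-level weights, and both the sample_lines helper and the awk
key (which simply deletes '-', ',' and '.') are assumed stand-ins, not
what sort does internally.

```shell
# Hypothetical helper: emit the three sample lines from the report.
sample_lines() {
  printf '%s\n' 'sampleId-1010,0.0625' 'sampleId-101,0.0625' 'sampleId-1010,1.0'
}

# Approximate "punctuation is ignored": build a key with '-', ',' and '.'
# deleted, sort byte-wise on "key<TAB>line", then strip the key again.
dictionary_like=$(
  sample_lines |
    awk '{ key = $0; gsub(/[-,.]/, "", key); print key "\t" $0 }' |
    LC_ALL=C sort |
    cut -f2-
)

printf '%s\n' "$dictionary_like"
# sampleId-1010,0.0625
# sampleId-101,0.0625
# sampleId-1010,1.0
```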

http://www.gnu.org/software/coreutils/faq/coreutils-faq.html#Sort-does-not-sort-in-normal-order_0021

Try 'LC_ALL=C sort' to see the difference.
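
For instance, feeding the three sample lines from the report through
sort in the C locale compares raw bytes (a sketch; the printf input
only reproduces those three lines).  There ',' (0x2C) sorts before
'0' (0x30), so the two sampleId-1010 lines end up adjacent:

```shell
# In the C locale sort compares raw bytes, so punctuation is significant.
bytewise=$(
  printf '%s\n' 'sampleId-1010,0.0625' 'sampleId-101,0.0625' 'sampleId-1010,1.0' |
    LC_ALL=C sort
)

printf '%s\n' "$bytewise"
# sampleId-101,0.0625
# sampleId-1010,0.0625
# sampleId-1010,1.0
```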

> Even if in some weird locale, ',' > '0', or some other weird thing
> were true, the two lines "sampleId-1010,0.0625" and
> "sampleId-1010,1.0" should be grouped together either before or after
> "sampleId-101,0.0625", because they share a common prefix

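Nope.  And the locale is not that weird.  Many locales ignore punctuation
during collation, in order to get dictionary sorting (rather than
byte-wise prefix sorting).  Once ',' is ignored, "sampleId-101,0" starts
with the same "sampleId1010" as the other two lines, so the shared
prefix you see in the raw bytes no longer groups them.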

- --
Don't work too hard, make some time for fun as well!

Eric Blake             address@hidden
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Public key at home.comcast.net/~ericblake/eblake.gpg
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkrqOHYACgkQ84KuGfSFAYCNhQCeMHBDVREcrM+QAlsSRJRGSTkd
3lYAoIIbWaNZvleYo1jKDoDfQ1mpi5aE
=Uhof
-----END PGP SIGNATURE-----



