Re: question about behavior of sort -n -t,
From: Pádraig Brady
Subject: Re: question about behavior of sort -n -t,
Date: Wed, 09 Oct 2013 00:34:10 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130110 Thunderbird/17.0.2

On 10/09/2013 12:10 AM, Gabriel Gaster wrote:
> Thank you both very much for your detailed responses. I've responded below.
>
> I would also like to ask that an example calling attention to this be put in
> the man pages. While the warning to use LC_ALL=C that is already there is
> helpful -- an explicit example that shows numeric sort acting differently
> would shed more light on this.
>
> --
> gabriel gaster
>
>
> On Tuesday, October 8, 2013, Pádraig Brady wrote:
>> Also note that while some of the sort functionality is awkward,
>> it's done like that for backwards and cross compatibility reasons.
>
> That makes sense. I suppose you mean in particular that sort relies on tables
> specified by the locale?
>
> On Tuesday, October 8, 2013, Eric Blake wrote:
>> Yes, it's a FAQ:
>> https://www.gnu.org/software/coreutils/faq/coreutils-faq.html#Sort-does-not-sort-in-normal-order_0021
>> and sort is doing what POSIX specifies for your particular machine's
>> definitions of locales, and in turn their description of how collation
>> and numeric parsing will perform in that locale. Except for the C
>> locale, different vendors have tended to have different rules, even for
>> locales that are otherwise named the same.
>
> Thanks for the FAQ, which I found very helpful. Then the question in my mind
> remains: if a user specifies a field-separator shouldn't that override the
> locale? In this case, `en_US.UTF-8' allows the comma character in numerics,
> however specifying that the comma character is a field-separator should mean
> it does not allow the comma in numerics.
>
> It seems that the locale overrides specific arguments to sort (in this case,
> field-separator=, ). From the FAQ, I understand this might be necessarily so,
> given how sort is implemented with reference to the locale tables. Still
> though, why isn't sort faithful to an argument given to it?

You could have data like:

  a,1,234
  bb,4,321
  c,1,111

If you wanted to sort by those grouped numbers, you'd need to honor the ","
in the locale and use:

  sort -t, -k2n

Perhaps this is something worth warning about in sort --debug,
i.e. when a numeric sort is specified together with -t[$group|$decimal],
but I'm inclined to think it's not worth it.
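
To make the locale dependence concrete, here is a minimal shell sketch using
the sample data above. It assumes GNU sort, and it assumes an en_US.UTF-8
locale with "," as its grouping character is installed. The variable names
are only illustrative.

```shell
# Sample data from this message.
data='a,1,234
bb,4,321
c,1,111'

# C locale: there is no grouping character, so the key that runs from
# field 2 to end of line ("1,234") is parsed as the number 1.
c_sorted=$(printf '%s\n' "$data" | LC_ALL=C sort -t, -k2n)
printf '%s\n' "$c_sorted"

# Assumption: en_US.UTF-8 is available. Here "," is a grouping character,
# so the same key may be read as one grouped number instead.
printf '%s\n' "$data" | LC_ALL=en_US.UTF-8 sort -t, -k2n

# Ending the key at field 2 (-k2,2n) keeps the separator out of the key.
field_sorted=$(printf '%s\n' "$data" | LC_ALL=C sort -t, -k2,2n)
printf '%s\n' "$field_sorted"
```

In the C locale the first command should list a, c, bb: the keys parse as
1, 4 and 1, and the tie between a and c falls back to a byte comparison of
the whole line. Where "," is a grouping character, the keys would be read as
1234, 4321 and 1111 instead. With -k2,2n the key stops at the end of field 2,
so the separator never reaches the numeric parser.
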
cheers,
Pádraig.