coreutils
Re: question about behavior of sort -n -t,


From: Pádraig Brady
Subject: Re: question about behavior of sort -n -t,
Date: Wed, 09 Oct 2013 00:34:10 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130110 Thunderbird/17.0.2

On 10/09/2013 12:10 AM, Gabriel Gaster wrote:
> Thank you both very much for your detailed responses. I've responded below.
> 
> I would also like to ask that an example calling attention to this be put in 
> the man pages. While the warning to use LC_ALL=C that is already there is 
> helpful -- an explicit example that shows numeric sort acting differently 
> would shed more light on this.
> 
> --
> gabriel gaster
> 
> 
> On Tuesday, October 8, 2013, Pádraig Brady wrote:
>> Also note that while some of the sort functionality is awkward,
>> it's done like that for backwards and cross compatibility reasons.
> 
> That makes sense. I suppose you mean in particular that sort relies on tables 
> specified by the locale?
> 
> On Tuesday, October 8, 2013, Eric Blake wrote:  
>> Yes, it's a FAQ:
>> https://www.gnu.org/software/coreutils/faq/coreutils-faq.html#Sort-does-not-sort-in-normal-order_0021
>> and sort is doing what POSIX requires for your particular machine's
>> definitions of locales, and in turn their description of how collation
>> and numeric parsing will perform in that locale. Except for the C
>> locale, different vendors have tended to have different rules, even for
>> locales that are otherwise named the same.
> 
> Thanks for the FAQ, which I found very helpful. Then the question in my mind 
> remains: if a user specifies a field-separator shouldn't that override the 
> locale? In this case, `en_US.UTF-8' allows the comma character in numerics, 
> however specifying that the comma character is a field-separator should mean 
> it does not allow the comma in numerics.
> 
> It seems that the locale overrides specific arguments to sort (in this case, 
> field-separator=, ). From the FAQ, I understand this might be necessarily so, 
> given how sort is implemented with reference to the locale tables. Still 
> though, why isn't sort faithful to an argument given to it?

You could have data like:

a,1,234
bb,4,321
c,1,111

If you wanted to sort by those grouped numbers, you'd need to honor the ','
that the locale defines as its digit-group separator,
and use sort -t, -k2n
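
To illustrate, here is a minimal shell sketch using the three sample lines. It assumes a POSIX shell and a sort that supports -t and -k; the variable names are purely illustrative. Because -k2n has no end field, the key runs from field 2 to the end of the line, so the second comma is part of the numeric key. The en_US.UTF-8 run is left commented out since that locale may not be installed; where it is, and where it defines ',' as the grouping character, the keys would be expected to compare as 1234, 4321 and 1111.

```shell
# Sample data from the message (illustrative variable name).
data='a,1,234
bb,4,321
c,1,111'

# C locale: there is no digit-group separator, so numeric parsing of the
# key "1,234" stops at the comma and the keys compare as 1, 4 and 1.
# Lines with equal keys fall back to a whole-line comparison.
c_locale=$(printf '%s\n' "$data" | LC_ALL=C sort -t, -k2n)
printf '%s\n' "$c_locale"

# Key bounded to field 2 only: the comma can never be part of the number.
field_only=$(printf '%s\n' "$data" | LC_ALL=C sort -t, -k2,2n)
printf '%s\n' "$field_only"

# Assumption: en_US.UTF-8 is installed and uses ',' for digit grouping.
# Uncomment to compare against the C locale result above.
# printf '%s\n' "$data" | LC_ALL=en_US.UTF-8 sort -t, -k2n
```

If the comma should never be treated as part of the number, bounding the key with -k2,2n restricts it to the second field alone.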

Perhaps this is something worth warning about in sort --debug,
i.e. when a numeric key is specified together with -t[$group|$decimal],
but I'm inclined to think it's not worth it.

cheers,
Pádraig.


