coreutils

Re: question about behavior of sort -n -t,


From: Gabriel Gaster
Subject: Re: question about behavior of sort -n -t,
Date: Wed, 9 Oct 2013 16:06:22 -0500

On Tuesday, October 8, 2013 at 8:48 PM, Eric Blake wrote:

> >  the question in my mind remains: if a user specifies a
> >  field-separator shouldn't that override the locale?
> > 
> 
> No, because POSIX requires that -n parse as many characters as
> possible regardless of locale, unless you explicitly ask to limit
> the sort to a specific key.


That's interesting. Could you point me to that section (if you
know it off the top of your head)? The POSIX requirement that -n parse
as many characters as possible regardless of locale seems to directly
contradict the other requirement you mentioned earlier (which at least
made sense to me): that -n parse characters until it sees a
non-numeric one, where what counts as numeric is locale dependent.

> Perhaps less likely to be used in real life, but still apropos to
> the example:
> $ printf '1202\n2011\n' | LC_ALL=C sort --debug -t0 -s -n -k1,1
> sort: using simple byte comparison
> 2011
> _
> 1202
> __
> $ printf '1202\n2011\n' | LC_ALL=C sort --debug -t0 -s -n
> sort: using simple byte comparison
> 1202
> ____
> 2011
> ____
> And you'll get the same behavior on Solaris or BSD sort (at least,
> assuming they don't have blatant POSIX compliance bugs). Once you
> understand WHY the above example has two different sorts, based on
> whether -k is used, you'll understand why we can't stop parsing -n
> at a comma even for -t, in a non-C locale.
> 

I understand why the above examples give two different sorts right
now. I just think that, in your example, -t0 should mean that 0 is no
longer a numeric character but a field separator (regardless of
locale), and therefore that sort should stop parsing the first line
after the 2, reading it as 12. In other words, sort -t0 -n should
output '2011\n1202', since 2 is smaller than 12. It seems that the
current rationale is to have the locale override user-specified field
separators, and then to have some other POSIX requirement (that
sort -n take as much as possible, regardless of locales and depending
on locales) override locales sometimes.
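
To make the two behaviours concrete, here is a small sketch that reruns
the quoted commands in the C locale without --debug and captures the
results. The variable names (whole_line, first_field) are mine and
purely illustrative; the sort options are the ones from the example
above.

```shell
# No -k: the key is the whole line, so -n reads 1202 and 2011 in full.
whole_line=$(printf '1202\n2011\n' | LC_ALL=C sort -t0 -s -n)

# -k1,1 with '0' as the separator: the key is only "12" and "2".
first_field=$(printf '1202\n2011\n' | LC_ALL=C sort -t0 -s -n -k1,1)

printf 'without -k:\n%s\n' "$whole_line"
printf 'with -k1,1:\n%s\n' "$first_field"
```

The second ordering (2011 before 1202) is the one I would have expected
from -t0 -n alone.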

> > It seems that the locale overrides specific arguments to sort (in
> > this case, field-separator=, ).
> > 
> 
> Rather, the lack of -k determines how far -n will parse, regardless
> of locale; it's just that some locales let -n parse farther than
> others.
> --
> Eric Blake   eblake redhat com   +1-919-301-3266
> Libvirt virtualization library http://libvirt.org


Don't you actually mean here that "the lack of -k determines how far -n
will parse, depending on locale"?
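
For anyone following along, the workaround implied earlier in the
thread (limit the sort to explicit keys) might look like the sketch
below for comma-separated data. The input values '1,5' and '1,10' are
made up for illustration, and I have only reasoned about the C locale
here; with the keys limited by -k, -n never gets to look past a comma.

```shell
# Hypothetical input: two comma-separated numeric fields per line.
# -k1,1n and -k2,2n restrict each numeric comparison to a single field.
keyed=$(printf '1,10\n1,5\n' | LC_ALL=C sort -t, -k1,1n -k2,2n)

printf '%s\n' "$keyed"
```

Field 1 ties (1 and 1), so the order is decided by field 2, where 5
sorts before 10.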



