Re: 'sort' bug
From: Bob Proulx
Subject: Re: 'sort' bug
Date: Fri, 30 May 2008 00:01:50 -0600
User-agent: Mutt/1.5.13 (2006-08-11)
Mike Markowski wrote:
> I think I've come across a bug in 'sort'. Using the attached file (please
> let me know if the attachment is stripped from this email), I tried to sort
> on the 5th column of states/countries by using:
>
> sort -k 5 c3
>
> The first few lines look like:
>
> 10-Apr-2008 W7GVE 729C Ed AZ 10.120
> 18-May-2008 W1GUE 1998 Ed NH 7.055
> 28-Apr-2008 KG4W 2416T Ed VA 7.055
> 11-May-2008 K4ZGB 796T Tom AL 7.055
> 16-May-2008 9A2VJ 2533 Vel CROATIA 14.052
> [...]
>
> already not properly sorted by state/country.
I think you have missed that, unless you specify -b, the blanks that
precede a field are part of that field.
`-t SEPARATOR'
`--field-separator=SEPARATOR'
Use character SEPARATOR as the field separator when finding the
sort keys in each line. By default, fields are separated by the
empty string between a non-blank character and a blank character.
That is, given the input line ` foo bar', `sort' breaks it into
fields ` foo' and ` bar'. The field separator is not considered
to be part of either the field preceding or the field following,
so with `sort -t " "' the same input line has three fields: an
empty field, `foo', and `bar'. However, fields that extend to the
end of the line, as `-k 2', or fields consisting of a range, as
`-k 2,3', retain the field separators present between the
endpoints of the range.
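Here is a minimal sketch of that rule on made-up two-column data (not the attached file). The sample lines and the variable names are hypothetical, and LC_ALL=C is an assumption added only to keep the comparison byte-wise and predictable:

```shell
# Hypothetical sample: the second column is aligned, so the short name
# "Ed" is followed by two spaces and "Tom" by only one.
sample=$(printf '%s\n' 'Tom AL' 'Ed  VA')

# Without -b the keys are " AL" and "  VA". The second character of
# "  VA" is a space, which sorts before "A", so the VA line comes first.
without_b=$(printf '%s\n' "$sample" | LC_ALL=C sort -k 2)

# With -b the leading blanks are skipped, the keys are "AL" and "VA",
# and the AL line comes first.
with_b=$(printf '%s\n' "$sample" | LC_ALL=C sort -b -k 2)

printf '%s\n---\n%s\n' "$without_b" "$with_b"
```

The only difference between the two runs is -b, so any change in order comes from the leading blanks alone.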
In your data, because "Ed" is one character shorter than "Tom", the
fifth field on those first three lines starts with one more space than
it does on the later lines. The fields being sorted are:
" AZ 10.120"
" NH 7.055"
" VA 7.055"
" AL 7.055"
" CROATIA 14.052"
That should illustrate the issue. The resulting order is correct for
the key as it has been specified. Also, you probably want to say where
the sort key ends, because as you can see -k5 sorts from field 5 to the
end of the line. That means that with -b -k5 you are sorting on these
strings:
"AZ 10.120"
"NH 7.055"
"VA 7.055"
"AL 7.055"
"CROATIA 14.052"
But with -b -k5,5 you would be sorting upon these strings:
"AZ"
"NH"
"VA"
"AL"
"CROATIA"
That is probably what you want. Also, you may or may not want the -s
option, which disables the last-resort comparison of the entire line.
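To make the difference between these key specifications concrete, here is a small sketch on invented six-column data in the same shape as the log. The call signs, the two-digit first column, and the variable names are hypothetical, and LC_ALL=C is again an assumption to keep the ordering byte-wise:

```shell
# Hypothetical log lines; field 5 is the state, field 6 the frequency.
log=$(printf '%s\n' \
  '02 K2 2 Ed  AL 14.052' \
  '01 K1 1 Tom AL 7.055' \
  '03 K3 3 Ed  AK 7.055')

# -k5 runs to end of line: keys are "AL 14.052", "AL 7.055", "AK 7.055",
# so the frequency text decides between the two AL lines.
to_eol=$(printf '%s\n' "$log" | LC_ALL=C sort -b -k5)

# -k5,5 stops at field 5: the two AL lines tie, and the last-resort
# whole-line comparison puts "01 ..." ahead of "02 ...".
field_only=$(printf '%s\n' "$log" | LC_ALL=C sort -b -k5,5)

# -s disables the last-resort comparison: tied lines keep input order.
stable=$(printf '%s\n' "$log" | LC_ALL=C sort -s -b -k5,5)

printf '%s\n---\n%s\n---\n%s\n' "$to_eol" "$field_only" "$stable"
```

In all three runs the AK line comes first. Only the order of the two AL lines changes, and each run has a different reason for the order it picks.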
> Yet, doing:
>
> sort -n -k 6 c3
>
> works as expected.
Numeric sorting skips leading spaces just like -b does when doing a
character sort.
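A short sketch of that difference, again on made-up data with hypothetical variable names and with LC_ALL=C assumed for a predictable byte-wise character sort:

```shell
# Hypothetical sample: names of different lengths leave a different
# number of spaces in front of the numeric column.
freqs=$(printf '%s\n' 'Ed  14.052' 'Tom 7.055')

# Character sort: keys are "  14.052" and " 7.055". The extra leading
# space sorts first, so the 14.052 line comes out ahead.
as_text=$(printf '%s\n' "$freqs" | LC_ALL=C sort -k 2)

# Numeric sort: leading blanks are skipped and 7.055 < 14.052.
as_number=$(printf '%s\n' "$freqs" | LC_ALL=C sort -n -k 2)

printf '%s\n---\n%s\n' "$as_text" "$as_number"
```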
Bob