Re: Bug in Linux "sort" command??? (Red Hat V9)
From: Jim Meyering
Subject: Re: Bug in Linux "sort" command??? (Red Hat V9)
Date: Fri, 23 Apr 2004 14:01:40 +0200

Stuart Allsop <address@hidden> wrote:
> The SORT command in Red Hat 9 doesn't seem to work as advertised. I'm
> trying to use it to sort some mixed-up web logs by year, month and day.
>
> I'm using this:
>
> sort -f -s -d -M -k4.9,4.12n -k4.5,4.7M -k4.2,4.3n ./raw_log -o
> ./sorted_log
>
> The web log record format is as follows:
>
> 236-56-50.dial.terra.cl - - [02/Feb/2004:00:01:48 +0000] "GET /
> HTTP/1.1..."

Thanks for reporting that, but it's not a bug in the code.
I suppose it means the documentation could use another example
and a warning that this is not intuitive.

The problem is that sort's default idea of what makes up the Nth field
is a bit strange: when no `-t' separator is given, each field includes
the white space that precedes it, so byte offsets within the field
are counted from that white space.
If there's always just one SPACE or TAB between what would normally
be considered the 3rd and 4th fields, then you can simply
add 1 to each of your byte offsets.  But that doesn't work
if some lines have two or more spaces.
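
To make the off-by-one concrete, here is a small sketch using two made-up
records in the quoted log format (the host names and dates are hypothetical,
and GNU sort in the C locale is assumed).  It compares the year key with the
offsets as originally written against the same key moved right by one byte.

```shell
# Two made-up records in the log format quoted above (hypothetical data).
sample() {
  printf '%s\n' \
    'a.example - - [02/Feb/2004:00:01:48 +0000] "GET / HTTP/1.1"' \
    'b.example - - [15/Mar/2003:10:20:30 +0000] "GET / HTTP/1.1"'
}

# Offsets counted as if field 4 began at '['.  Because the field really
# begins with the blank before '[', the key is "/200", which is not a
# number, so both keys compare equal and -s keeps the input order.
unshifted=$(sample | LC_ALL=C sort -s -k4.9,4.12n)

# Offsets moved right by one to allow for the single leading blank:
# the key is now the four-digit year, so the 2003 record sorts first.
shifted=$(sample | LC_ALL=C sort -s -k4.10,4.13n)

printf 'unshifted:\n%s\n\nshifted:\n%s\n' "$unshifted" "$shifted"
```

The shifted form only helps while every line has exactly one blank before
the fourth field.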

A better solution is to use the `b' modifier to make sort
ignore leading blanks for each key specifier.  The modifier applies
only to the position it is attached to, so put it on both the start
and the end of each key:

  sort -f -s -d -k4.9b,4.12bn -k4.5b,4.7bM -k4.2b,4.3bn

Note that specifying a `global' `-b' before the first
-k option doesn't have any effect, since global
options do not apply to a `-k SPEC' whose SPEC carries
its own modifiers, like the `M' and `n' in your example.
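
A rough sketch of the per-key `b' modifier, again on made-up records
(hypothetical hosts and dates, with one, two and three blanks before the
fourth field) and assuming GNU sort in the C locale.  It attaches `b' to
both the start and end offsets of each key, on the understanding that `b'
affects only the position it is attached to, and contrasts that with a
global `-b'.

```shell
# Made-up records; the number of blanks before field 4 varies on purpose.
sample() {
  printf '%s\n' \
    'a.example - - [02/Feb/2004:00:01:48 +0000] "GET / HTTP/1.1"' \
    'b.example - -   [15/Mar/2003:10:20:30 +0000] "GET / HTTP/1.1"' \
    'c.example - -  [01/Feb/2004:00:00:05 +0000] "GET / HTTP/1.1"'
}

# Per-key b: offsets are counted from the first non-blank byte of the
# field, so the keys are year, month name and day regardless of spacing.
# The records should come out ordered by date: b, c, a.
per_key=$(sample |
  LC_ALL=C sort -s -k4.9b,4.12bn -k4.5b,4.7bM -k4.2b,4.3bn)

# Global -b: the key carries its own modifier (n), so it does not pick
# up the global option; the offsets still count the leading blanks, the
# keys are non-numeric, and -s leaves the input order untouched.
global_b=$(sample | LC_ALL=C sort -b -s -k4.9,4.12n)

printf 'per-key b:\n%s\n\nglobal -b:\n%s\n' "$per_key" "$global_b"
```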
As for why your command worked with Irix,
it's probably because this part of the POSIX spec for sort
is a little bit tricky -- some might even say `ambiguous'.