bug#8040: join 5.97 bug
From: Batson, Brannon
Subject: bug#8040: join 5.97 bug
Date: Mon, 14 Feb 2011 19:22:13 -0500
Sorry, when I said 'join -v 1 d c', I meant 'join -v 1 b a'. The only files
involved are a & b, whose contents I sent.
The fundamental problem is that join and sort disagree on what 'sorted' means
in this case.
Brannon
________________________________________
From: Eric Blake address@hidden
Sent: Monday, February 14, 2011 7:18 PM
To: Batson, Brannon
Cc: address@hidden
Subject: Re: bug#8040: join 5.97 bug
On 02/14/2011 04:45 PM, Batson, Brannon wrote:
>
> File a:
> 10 A
> 1 B
>
> File b:
> 1
>
> $ join b a
> <nada>
>
> $ join -v 1 d c
> 1
You didn't provide a file d or c to compare against.
>
> files a & b are both sorted lexicographically (according to 'sort', anyway).
> The problem is that the join lexicographic '<' operator disagrees with sort's.
Thanks for the report. I can't help but wonder if you've stumbled into
this:
http://www.gnu.org/software/coreutils/faq/#join-requires-sorted-input-files
At any rate, the only bug here is in your input files.
>
> Sorry if this bug has been found like a thousand times before, couldn't find
> it via 30s of googling.
Coreutils 5.97 is OLD. The latest stable release is 8.10, and it has
improved diagnostics for helping you discover sorting problems with join:
join --help reminds you that:
Important: FILE1 and FILE2 must be sorted on the join fields.
E.g., use ` sort -k 1b,1 ' if `join' has no options,
or use ` join -t '' ' if `sort' has no options.
And trying your example with LC_ALL=en_US.UTF-8 gives:
$ join b a
join: file 2 is not in sorted order
Sure enough, sort --debug finds the culprit (a was not sorted
according to -k 1b,1):
$ sort --debug a
sort: using `en_US.UTF-8' sorting rules
10 A
____
1 B
____
$ sort --debug -k 1b,1 a
sort: using `en_US.UTF-8' sorting rules
1 B
_
____
10 A
__
____
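
Putting the advice above together, here is a minimal sketch of one way to
get matching results. It assumes the C locale is acceptable for the data and
that both tools run under the same LC_ALL; the scratch directory and the
a.sorted/b.sorted file names are made up for illustration.

```shell
# Recreate files a and b from the report in a scratch directory.
tmpdir=$(mktemp -d)
printf '10 A\n1 B\n' > "$tmpdir/a"
printf '1\n' > "$tmpdir/b"

# Pin the locale so that sort and join agree on collation order.
export LC_ALL=C

# Sort both inputs on the join field only, ignoring leading blanks,
# as suggested by join --help.
sort -k 1b,1 "$tmpdir/a" > "$tmpdir/a.sorted"
sort -k 1b,1 "$tmpdir/b" > "$tmpdir/b.sorted"

# Paired lines, then lines of b that have no match in a.
result=$(join "$tmpdir/b.sorted" "$tmpdir/a.sorted")
unpaired=$(join -v 1 "$tmpdir/b.sorted" "$tmpdir/a.sorted")

printf 'joined:   %s\n' "$result"
printf 'unpaired: %s\n' "$unpaired"

rm -rf "$tmpdir"
```

With the inputs sorted this way, join should pair the single line of b with
the '1 B' line of a, and -v 1 should report nothing unpaired.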
--
Eric Blake address@hidden +1-801-349-2682
Libvirt virtualization library http://libvirt.org