coreutils
Re: uniq with sort-like "--key" support


From: Assaf Gordon
Subject: Re: uniq with sort-like "--key" support
Date: Wed, 13 Feb 2013 11:45:26 -0500
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:10.0.4) Gecko/20120510 Icedove/10.0.4

>> On 02/12/2013 01:31 AM, Assaf Gordon wrote:
>>>
>>> I'd like to offer a proof-of-concept patch for adding sort-like "--key" 
>>> support for the 'uniq' program, as discussed here:
>>>    http://lists.gnu.org/archive/html/bug-coreutils/2006-06/msg00211.html
>>> and in several other threads.
>>>

One more update with two changes:

1. Rearranged "src/uniq_sort_common.h" so that its functions appear in the same
order as in "src/sort.c".
This makes "diff src/uniq_sort_common.h src/sort.c" much easier to read, and
shows that the functions were not modified at all.

2. When an explicit field separator is specified together with "-c", the counts
are printed without the space-padded, right-aligned formatting, and are followed
by the separator instead of a space.
This might be controversial, but I have always needed it :) (I used to wrap every
"uniq -c" with "sed 's/^  *// ; s/ /\t/'".)
==
## Existing:
$ printf "a\tx\na\tx\nb\ty\n" | uniq -c
      2 a       x
      1 b       y

## New:
$ printf "a\tx\na\tx\nb\ty\n" | ./src/uniq -t $'\t' -c      
2       a       x
1       b       y
==
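
For comparison, the sed wrapper mentioned above can be sketched as a small
shell function that works with an unpatched uniq. This is only an illustration:
the function name "uniq_c_tab" is made up here, and the tab is passed to sed
through a shell variable so that the sketch does not rely on sed understanding
"\t" in the replacement text.

```shell
# Hypothetical helper (not part of coreutils): run stock "uniq -c" and
# rewrite its output as "COUNT<TAB>LINE" without the leading padding.
uniq_c_tab() {
  tab=$(printf '\t')
  # 1st expression: drop the spaces that right-align the count.
  # 2nd expression: turn the single space after the count into a tab.
  uniq -c "$@" | sed "s/^ *//; s/ /$tab/"
}

printf 'a\tx\na\tx\nb\ty\n' | uniq_c_tab
```

With the patched uniq, "-t $'\t' -c" would make such a wrapper unnecessary.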


Also, I'm wondering what exactly the effect of the following statement is
(from http://lists.gnu.org/archive/html/bug-coreutils/2006-06/msg00217.html):
  "This point was addressed in IEEE Std 1003.1-2001/Cor 1-2002, item
  XCU/TC1/D6/40, and it's why the current Posix spec says that the
  behavior of uniq depends on LC_COLLATE."

Do sort's keycompare functions fulfill this requirement, and do the current
'uniq' tests check this situation?
Otherwise my changes are not backwards-compatible.

Thanks,
 -gordon


