Hello,
I'd like to offer a proof-of-concept patch for adding sort-like "--key" support
for the 'uniq' program, as discussed here:
http://lists.gnu.org/archive/html/bug-coreutils/2006-06/msg00211.html
and in several other threads.
The patch involves a few core changes:
1. All key-related functions were copied as-is from "sort.c" and put in a separate file
(uniq_sort_common.h). In theory, those could be extracted later into a file shared by both sort
and uniq. At the moment, it's a hodge-podge of copy&paste, including code that's not relevant to
uniq (like "reverse").
2. The function "check_files" was modified to convert "struct linebuffer" (used by uniq)
to "struct line" (used by sort's functions).
3. The "different" function was modified to call sort's "keycompare" function.
4. In main(), the key argument parsing was copied from 'sort', and some code was added to
adapt the existing options (e.g. skip-fields/skip-chars/check-chars) to the internal "struct
keyfield".
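To illustrate items 2 and 3, here is a rough, self-contained sketch. It is not the patch code. The two structs are simplified stand-ins for gnulib's "struct linebuffer" and sort.c's "struct line"; the names linebuffer_sketch, line_sketch, set_key_field, to_line and different_sketch are hypothetical. A plain byte comparison stands in for sort's "keycompare", and fields are assumed to be blank-separated with leading blanks skipped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for the line buffer uniq reads into. */
struct linebuffer_sketch {
  size_t size;    /* bytes allocated */
  size_t length;  /* bytes used, including the trailing newline */
  char *buffer;
};

/* Simplified stand-in for the line representation sort's key code expects. */
struct line_sketch {
  char *text;
  size_t length;
  char *keybeg;   /* start of the key within text */
  char *keylim;   /* one past the end of the key */
};

/* Hypothetical helper: point keybeg/keylim at the 1-based blank-separated
   field.  A missing field yields an empty key at the end of the line. */
static void set_key_field(struct line_sketch *l, size_t field)
{
  char *p = l->text;
  char *end = l->text + l->length;
  if (end > p && end[-1] == '\n')
    end--;                              /* ignore the trailing newline */
  for (size_t i = 1; ; i++) {
    while (p < end && (*p == ' ' || *p == '\t'))
      p++;                              /* skip blanks before the field */
    char *start = p;
    while (p < end && *p != ' ' && *p != '\t')
      p++;                              /* consume the field itself */
    if (i == field) { l->keybeg = start; l->keylim = p; return; }
    if (p == end)   { l->keybeg = l->keylim = end; return; }
  }
}

/* Item 2: view a uniq-style buffer as a sort-style line (no copying). */
static struct line_sketch to_line(const struct linebuffer_sketch *lb,
                                  size_t field)
{
  assert(lb != NULL && lb->buffer != NULL);
  struct line_sketch l = { lb->buffer, lb->length, NULL, NULL };
  set_key_field(&l, field);
  return l;
}

/* Item 3: "different" decides by comparing keys only.  The byte-wise
   memcmp is a placeholder for sort's keycompare. */
static bool different_sketch(const struct linebuffer_sketch *a,
                             const struct linebuffer_sketch *b,
                             size_t field)
{
  struct line_sketch la = to_line(a, field);
  struct line_sketch lb = to_line(b, field);
  size_t alen = (size_t) (la.keylim - la.keybeg);
  size_t blen = (size_t) (lb.keylim - lb.keybeg);
  return alen != blen || memcmp(la.keybeg, lb.keybeg, alen) != 0;
}
```

With this sketch, the lines "A 1" and "A 2" compare equal on field 1 and different on field 2.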
The result is that uniq can now do:
===
$ printf "A 1\nA 2\nB 2\n" | ./src/uniq -k1,1
A 1
B 2
$ printf "A 1\nA 2\nB 2\n" | ./src/uniq -k2,2
A 1
A 2
===
Most (but not all) of the existing tests pass.
New tests to demonstrate the new possibilities have been added to
'tests/misc/uniq-key.pl'; try them with:
make check TESTS=tests/misc/uniq-key SUBDIRS=.
I think that most of the key-comparison functions (like
numeric/general-numeric/month/version/skip-blanks) would "just work", though I
haven't tested them thoroughly yet.
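As a rough illustration of why the comparison mode matters for uniq, the sketch below contrasts a byte-wise key comparison with a numeric one. It is a simplification and not sort's code: the helper names are hypothetical, strtod is used as a stand-in for sort's numeric handling, and keys longer than 63 bytes are truncated for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: parse a key of known length as a double.  Keys
   inside a line are not NUL-terminated, so copy them to a buffer first. */
static double key_to_double(const char *key, size_t len)
{
  char buf[64];
  assert(key != NULL);
  if (len >= sizeof buf)
    len = sizeof buf - 1;               /* truncate overly long keys */
  memcpy(buf, key, len);
  buf[len] = '\0';
  return strtod(buf, NULL);             /* non-numeric text parses as 0 */
}

/* Default behaviour: keys differ unless they are byte-for-byte equal. */
static bool keys_differ_bytes(const char *a, size_t alen,
                              const char *b, size_t blen)
{
  return alen != blen || memcmp(a, b, alen) != 0;
}

/* Numeric behaviour: keys differ only if their numeric values differ. */
static bool keys_differ_numeric(const char *a, size_t alen,
                                const char *b, size_t blen)
{
  return key_to_double(a, alen) != key_to_double(b, blen);
}
```

Under these assumptions the keys "1" and "01" differ byte-wise but not numerically, so a numeric key would let adjacent lines carrying them collapse into one.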
Comments are welcome,
-gordon