Re: uniq: missing option -W / --check-fields=N
From: Jim Meyering
Subject: Re: uniq: missing option -W / --check-fields=N
Date: Tue, 27 Jun 2006 14:51:21 +0200

Pádraig Brady <address@hidden> wrote:
> Jim Meyering wrote:
>>
>> Hi Matt,
>>
>> I'm glad you're willing to work on this.
>> It's an often-requested feature.
>> Unfortunately, the Debian -W patch was not acceptable.
>> It did not allow the same flexibility that sort does in
>> selecting keys. To provide that, GNU uniq will eventually
>> accept at least the following options, just as sort does:
>>
>>   -k, --key=POS1[,POS2]      start a key at POS1, end it at POS2 (origin 1)
>>   -t, --field-separator=SEP  use SEP instead of non-blank to blank transition
>>   -z, --zero-terminated      end lines with 0 byte, not newline
>>
>> and even most, if not all, of these (for flexibility/interoperability
>> with sort, as well as to ease code sharing between uniq and sort):
>>
>>   -b, --ignore-leading-blanks  ignore leading blanks
>>   -d, --dictionary-order       consider only blanks and alphanumeric characters
>>   -i, --ignore-nonprinting     consider only printable characters
>
> agreed
>
>>   -f, --ignore-case            fold lower case to upper case characters
>
> It has this already. See below.
>
>>   -g, --general-numeric-sort   compare according to general numerical value
>>   -M, --month-sort             compare (unknown) < `JAN' < ... < `DEC'
>>   -n, --numeric-sort           compare according to string numerical value
>>   -r, --reverse                reverse the result of comparisons
>
> These 4 deal with specific order which I don't think uniq should worry about?

You're right about --reverse. Thanks.
However, the others change sort's idea of which values are equal,
so they are relevant. For -g, 0.0 == 0 == 00, etc.
For -M, FEB == feb == Feb, etc.
For -n, 00 == 0.

The idea is to be able to use uniq with the same keyspec options
as you used when sorting the data.
That means the command-line options listed above as well as the
key spec modifier options like b, d, g, M, etc., as used, e.g., in
-k 1b,1 -k 2n.
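
To illustrate the point about equality, here is a minimal Python sketch (not coreutils code) of a uniq-like pass that collapses adjacent lines whose keys compare equal. The helper names general_numeric_key, month_key and uniq_by_key are hypothetical, and the key functions only approximate what sort -g and sort -M do; in particular, treating unparsable input as 0.0 is an assumption of this sketch.

```python
from itertools import groupby

MONTHS = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"]

def general_numeric_key(field):
    # Rough stand-in for sort -g: parse the field as a float, so that
    # "0", "00" and "0.0" all map to the same key. Treating unparsable
    # text as 0.0 is an assumption made only for this sketch.
    try:
        return float(field.strip())
    except ValueError:
        return 0.0

def month_key(field):
    # Rough stand-in for sort -M: case-insensitive three-letter month
    # abbreviation; anything else ranks as 0 ("unknown").
    abbrev = field.strip()[:3].upper()
    return MONTHS.index(abbrev) + 1 if abbrev in MONTHS else 0

def uniq_by_key(lines, key):
    # Keep the first line of each run of adjacent lines with equal keys.
    return [next(run) for _, run in groupby(lines, key=key)]

print(uniq_by_key(["0.0", "0", "00", "1"], general_numeric_key))
print(uniq_by_key(["FEB", "feb", "Feb", "MAR"], month_key))
```

With a plain byte-wise comparison none of those lines would be merged; with the key functions above, each run collapses to its first line, which is the behaviour one would want from uniq after sorting with the same key options.
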
> uniq can be efficient and assume LANG=C always as
> it need only care if adjacent items match or not.
> Assuming LANG=C may be an issue for --ignore-case though?
> However I notice v5.2.1 at least only seems to handle ascii:
>
> $ LANG=ga_IE.utf8 uniq -i < Pádraig
> Pádraig
> PÁdraig
Yes, that's still a problem.
Would you like to work on it?
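
For reference, the symptom above can be reproduced outside of uniq. The following Python sketch contrasts an ASCII-only case fold with a Unicode-aware one; ascii_fold and unicode_fold are hypothetical names, and the ASCII-only fold is just one plausible model of the observed behaviour, not a description of how uniq is actually implemented.

```python
def ascii_fold(text):
    # Fold only A-Z to a-z, leaving every other character untouched.
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in text)

def unicode_fold(text):
    # Unicode-aware, locale-independent case folding.
    return text.casefold()

a = "P\u00e1draig"  # Pádraig
b = "P\u00c1draig"  # PÁdraig

# The ASCII-only fold leaves the accented capital alone, so the names differ.
print(ascii_fold(a) == ascii_fold(b))
# The Unicode-aware fold maps both spellings to the same string.
print(unicode_fold(a) == unicode_fold(b))
```

A fix in C would presumably need multibyte-aware comparison rather than per-byte case conversion, but that is a guess about the direction, not a worked-out design.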