bug#11220: uniq -d and -Du bug?

From: Eric Blake
Subject: bug#11220: uniq -d and -Du bug?
Date: Wed, 11 Apr 2012 06:07:30 -0600
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:11.0) Gecko/20120329 Thunderbird/11.0.1

tag 11220 notabug

On 04/10/2012 11:43 PM, phil colbourn wrote:
> What should this print?
> echo -e 'aa\naa\naa\n' | uniq -d

Thanks for the report.  POSIX requires this to print only a single
instance of 'aa', whether or not -d is in effect; coreutils does this by
outputting the last line in a series of duplicates.  The point of -d is
to suppress the single-line outputs that do not have a corresponding
duplicate input, not to output all instances of a duplicated line.
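As a minimal sketch of that distinction (the sample data and the variable names `input`, `plain`, and `dups` are made up for illustration), compare the default output with the -d output on input that mixes repeated and non-repeated lines:

```shell
# Three groups of adjacent lines: 'aa' x3, 'bb' x1, 'cc' x2.
input=$(printf 'aa\naa\naa\nbb\ncc\ncc\n')

# Default: one line per group of adjacent duplicates.
plain=$(printf '%s\n' "$input" | uniq)

# -d: one line per group, but only for groups that actually repeat.
dups=$(printf '%s\n' "$input" | uniq -d)

printf 'uniq:\n%s\n\nuniq -d:\n%s\n' "$plain" "$dups"
```

Here 'bb' appears in the default output but not in the -d output, and 'aa' appears once in both, never three times.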

By the way, 'echo -e' is not portable; POSIX recommends you use printf
instead.
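A portable way to feed the same three lines to uniq might look like the following sketch (the variable name `report` is only illustrative, and the trailing empty line from the original command is left out):

```shell
# printf expands \n in its format string in any POSIX shell, whereas
# some shells' echo prints a literal "-e" or leaves \n unexpanded.
report=$(printf 'aa\naa\naa\n' | uniq -d)

# Show what uniq -d produced for three identical input lines.
printf '%s\n' "$report"
```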

> Now, -D and -u means 'print all duplicate lines' and 'only print unique
> lines'.

-D is not specified by POSIX.  However, -u is defined by POSIX to
suppress output lines that have a corresponding duplicate input.
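For illustration only (the sample data and the variable names are assumptions, not taken from the report), -u keeps just the lines that have no adjacent duplicate:

```shell
# 'aa' x2, 'bb' x1, 'cc' x3: only 'bb' has no adjacent duplicate.
input=$(printf 'aa\naa\nbb\ncc\ncc\ncc\n')

# -u: suppress every group that contains a repeated line.
uniques=$(printf '%s\n' "$input" | uniq -u)

printf '%s\n' "$uniques"
```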

> I think this should print all lines since union of all unique lines and all
> duplicate lines is all lines.

> Therefore -Du prints first N-1 matching lines and not last matching line.

In isolation, uniq prints the last instance of the duplicated line, and
uniq -u suppresses the output of the 4th line.  In isolation, -D says to
output the first three lines which are normally omitted because they
have duplicates, in addition to the 4th line that is printed by default.
So in combination, -Du says to print the lines with subsequent
duplicates (the first three lines) but to suppress the output line that
corresponds to the last input line that ends a sequence of duplicates
(the 4th line).
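The four-line case can be tried with a sketch like the one below. The input of four identical lines is an assumption chosen to match the "4th line" wording above. Since -D is a GNU extension, the last command may fail on other implementations, so its errors are discarded and no particular output is promised here.

```shell
# Assumed example input: four identical lines.
input=$(printf 'aa\naa\naa\naa\n')

# POSIX behavior: the default prints one 'aa'; -u prints nothing,
# because the only group in the input contains duplicates.
default_out=$(printf '%s\n' "$input" | uniq)
u_out=$(printf '%s\n' "$input" | uniq -u)

# GNU extension: per the explanation above, -D -u is expected to print
# the first three lines and suppress the fourth. Not all uniq
# implementations accept -D, so failure is tolerated.
du_out=$(printf '%s\n' "$input" | uniq -D -u 2>/dev/null || true)

printf 'uniq:    [%s]\n' "$default_out"
printf 'uniq -u: [%s]\n' "$u_out"
printf 'uniq -Du line count: %s\n' "$(printf '%s' "$du_out" | grep -c .)"
```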

Perhaps we can document this behavior better.  Or perhaps we can change
the behavior of -D (but at risk of breaking existing clients that depend
on the current behavior).  But we can't change -u or -d behavior.

Put another way, per POSIX, the default behavior is subtractive (remove
any line with a subsequent duplicate), -d is subtractive (remove any
line with no duplicate), -u is subtractive (remove any last line that
had a prior duplicate), and GNU -D is additive (print any line with a
subsequent duplicate, to counter the initial default).
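The subtractive/additive view can be modeled with a small sketch. This is not how uniq is implemented; `classify` and `keep` are hypothetical helpers written for this illustration, and the model is only applied to the four-identical-line case discussed above.

```shell
# Hypothetical helper: tag each line by its position in a run of
# adjacent identical lines.
#   U = run of length 1 (no duplicate)
#   E = earlier line of a longer run (has a subsequent duplicate)
#   L = last line of a longer run (had a prior duplicate)
classify() {
  awk '
    function flush(   i, tag) {
      for (i = 1; i <= n; i++) {
        tag = (n == 1) ? "U" : ((i < n) ? "E" : "L")
        print tag, run[i]
      }
      n = 0
    }
    NR > 1 && $0 != prev { flush() }
    { run[++n] = $0; prev = $0 }
    END { flush() }
  '
}

# Hypothetical helper: print only lines whose tag is listed in $1.
keep() {
  classify | awk -v tags="$1" 'index(tags, $1) { print substr($0, 3) }'
}

input=$(printf 'aa\naa\naa\naa\n')

default_model=$(printf '%s\n' "$input" | keep UL)  # default: drop E
d_model=$(printf '%s\n' "$input" | keep L)         # -d: also drop U
u_model=$(printf '%s\n' "$input" | keep U)         # -u: also drop L
du_model=$(printf '%s\n' "$input" | keep E)        # -Du on this input: E only

printf 'default: [%s]\n-d: [%s]\n-u: [%s]\n' "$default_model" "$d_model" "$u_model"
printf -- '-Du:\n%s\n' "$du_model"
```

On this input the model yields one 'aa' for the default and for -d, nothing for -u, and three 'aa' lines for -Du, which is the N-1 result the report asked about.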

> Are these bugs?

At this point, I will claim that the behavior is intended, and therefore
close out the bug.  But if you are willing to submit documentation
patches, or even code patches accompanied by extensive test cases to
demonstrate the corner cases of any new behavior, feel free to continue
to reply to this bug report.

Eric Blake   address@hidden    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
