bug-coreutils

bug#61696: Warn about sort --numeric-sort --unique data loss


From: Dan Jacobson
Subject: bug#61696: Warn about sort --numeric-sort --unique data loss
Date: Wed, 22 Feb 2023 09:26:21 +0800

At (info "(coreutils) sort invocation") it says
  For example, ‘sort -n -u’ inspects only the value of the initial
  numeric string when checking for uniqueness, whereas ‘sort -n | uniq’
  inspects the entire line. *Note uniq invocation::.

OK, but you still need to add a warning about data loss.

Here's a shell script:

k="3 Billy
17 Villy
4 Nibblesberg
3 Philbert
3 Billy"
c=sort
echo "We sort the students [$c]"
echo "$k" | $c
c="sort --numeric-sort"
echo "Oh my gosh, we must use [$c]"
echo "$k" | $c
c="sort --numeric-sort --unique"
echo "Yuck, let's eliminate the duplicates too [$c]"
echo "$k" | $c
echo 'Oops, we caused "data loss". Good thing we noticed it.'
c="sort --unique" d="sort --numeric-sort"
echo "Let's try it the right way: [$c | $d]"
echo "$k" | $c | $d

Running it shows:
We sort the students [sort]
17 Villy
3 Billy
3 Billy
3 Philbert
4 Nibblesberg
Oh my gosh, we must use [sort --numeric-sort]
3 Billy
3 Billy
3 Philbert
4 Nibblesberg
17 Villy
Yuck, let's eliminate the duplicates too [sort --numeric-sort --unique]
3 Billy
4 Nibblesberg
17 Villy
Oops, we caused "data loss". Good thing we noticed it.
Let's try it the right way: [sort --unique | sort --numeric-sort]
3 Billy
3 Philbert
4 Nibblesberg
17 Villy
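
For comparison, here is a small sketch that measures the loss by setting
'sort -n -u' against the 'sort -n | uniq' pipeline that the manual text
quoted above mentions. The variable names (by_number, by_line, lost) are
made up for illustration, and it assumes a sort that behaves like the one
in the transcript above.

```shell
# Same sample data as in the script above.
k="3 Billy
17 Villy
4 Nibblesberg
3 Philbert
3 Billy"

# -n -u: lines count as duplicates when their leading numbers are equal.
by_number=$(printf '%s\n' "$k" | sort -n -u)

# sort -n | uniq: only adjacent lines that match in full are collapsed.
by_line=$(printf '%s\n' "$k" | sort -n | uniq)

# Lines that -n -u dropped even though their text differed.
lost=$(( $(printf '%s\n' "$by_line" | wc -l) - $(printf '%s\n' "$by_number" | wc -l) ))
echo "Lines dropped only because their number matched: $lost"
```

On the five sample lines this reports 1 dropped line; in the transcript
above that line is "3 Philbert".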

Sure, you might say, "That's already mentioned" (in the fine print). "The
reader just needs to put 2 + 2 together in their head." Yes, but the
documentation still needs to drive the point home: 'sort -n -u' silently
drops lines that differ only after the leading number.

Maybe the man page should say so too.




