bug#61696: Warn about sort --numeric-sort --unique data loss
From: Dan Jacobson
Subject: bug#61696: Warn about sort --numeric-sort --unique data loss
Date: Wed, 22 Feb 2023 09:26:21 +0800
At (info "(coreutils) sort invocation") it says
For example, ‘sort -n -u’ inspects only the value of the initial
numeric string when checking for uniqueness, whereas ‘sort -n | uniq’
inspects the entire line. *Note uniq invocation::.
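The difference the manual describes can be seen on a two-line input. This is only a minimal sketch; the sample data and the variable names `data`, `by_key`, and `by_line` are made up for illustration.

```shell
# Hypothetical sample data: two distinct lines sharing the numeric prefix 3.
data='3 Billy
3 Philbert'

# -n -u compares only the leading number, so the two lines count as duplicates
# and a single line is kept.
by_key=$(printf '%s\n' "$data" | sort -n -u)

# sort -n | uniq compares whole lines, so both lines survive.
by_line=$(printf '%s\n' "$data" | sort -n | uniq)

printf '%s\n' "sort -n -u:" "$by_key" "sort -n | uniq:" "$by_line"
```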
OK, but you still need to add a warning about data loss.
Here's a shell script:
k="3 Billy
17 Villy
4 Nibblesberg
3 Philbert
3 Billy"
c=sort
echo "We sort the students [$c]"
echo "$k" | $c
c="sort --numeric-sort"
echo "Oh my gosh, we must use [$c]"
echo "$k" | $c
c="sort --numeric-sort --unique"
echo "Yuck, let's eliminate the duplicates too [$c]"
echo "$k" | $c
echo 'Oops, we caused "data loss". Good thing we noticed it.'
c="sort --unique" d="sort --numeric-sort"
echo "Let's try it the right way: [$c | $d]"
echo "$k" | $c | $d
Running it shows:
We sort the students [sort]
17 Villy
3 Billy
3 Billy
3 Philbert
4 Nibblesberg
Oh my gosh, we must use [sort --numeric-sort]
3 Billy
3 Billy
3 Philbert
4 Nibblesberg
17 Villy
Yuck, let's eliminate the duplicates too [sort --numeric-sort --unique]
3 Billy
4 Nibblesberg
17 Villy
Oops, we caused "data loss". Good thing we noticed it.
Let's try it the right way: [sort --unique | sort --numeric-sort]
3 Billy
3 Philbert
4 Nibblesberg
17 Villy
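One possible way to check whether a given input is affected is to compare how many lines survive each kind of uniqueness test. This is just a sketch on the same roster; the variable names `distinct` and `kept` are invented here, and the `tr -d ' '` only strips the padding some `wc` implementations add.

```shell
# Same roster as in the script above.
k="3 Billy
17 Villy
4 Nibblesberg
3 Philbert
3 Billy"

# Number of distinct whole lines.
distinct=$(printf '%s\n' "$k" | sort -u | wc -l | tr -d ' ')

# Number of lines kept when uniqueness is judged by the numeric key only.
kept=$(printf '%s\n' "$k" | sort -n -u | wc -l | tr -d ' ')

# If fewer lines are kept, some distinct lines share a numeric key
# and would be silently dropped.
if [ "$kept" -lt "$distinct" ]; then
  echo "sort -n -u would drop $((distinct - kept)) distinct line(s)"
fi
```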
Sure, you might say, "That's already mentioned" (in the fine print). "The
reader just needs to put 2 + 2 together in their head." Yes, but the
documentation still needs to drive the point home more forcefully.
Maybe the man page should say so too.