bug-coreutils

bug#18168: Bug in "sort -V" ?


From: Assaf Gordon
Subject: bug#18168: Bug in "sort -V" ?
Date: Tue, 6 Nov 2018 11:48:07 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.2.1

tags 18168 notabug
close 18168
stop

(triaging old bugs)

Hello,

It seems your message was lost and not replied to in 4 years.
Sorry about that.

On 2014-08-01 3:38 a.m., Schleusener, Jens wrote:
I am not sure whether this is a bug, but in my use cases the "sort" command with the very helpful "-V" option (natural sort of (version) numbers within text) does not always produce the output I expect.

Note that "-V/--version-sort" specifically sorts by Debian's *version*
sorting rules. It might seem like the same thing as "natural sort", but
it is not.

The exact rules are here:
https://www.debian.org/doc/debian-policy/ch-controlfields.html#version
https://readme.phys.ethz.ch/documentation/debian_version_numbers/


Example input file (with four test cases):

1.0.5_src.tar.gz
1.0_src.tar.gz
2.0.5src.tar.gz
2.0src.tar.gz
3.0.5/
3.0/
4.0.5beta/
4.0beta/

Sorted ("sort -V") output file (with errors?):

1.0.5_src.tar.gz
1.0_src.tar.gz
2.0src.tar.gz
2.0.5src.tar.gz
3.0.5/
3.0/
4.0beta/
4.0.5beta/

Output file I expected:

1.0_src.tar.gz
1.0.5_src.tar.gz
2.0src.tar.gz
2.0.5src.tar.gz
3.0/
3.0.5/
4.0beta/
4.0.5beta/

The disagreement is about "1.0_src.tar.gz" vs "1.0.5_src.tar.gz"
and "3.0/" vs "3.0.5/".

Note that "_" and "/" are not valid characters in Debian
version strings.
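
Whether a string stays within the allowed character set can be checked
before handing it to dpkg. The following is a minimal sketch, assuming the
character set given in the Debian policy manual (alphanumerics plus ".",
"+", "-", "~", and ":" for an epoch). The function name valid_chars is
made up for illustration, and it checks characters only, not the full
version syntax.

```shell
# Hypothetical helper (not part of dpkg): succeed only if "$1" contains
# nothing but characters that Debian policy allows in a version string.
# Assumption: allowed set is [A-Za-z0-9], ".", "+", "-", "~" and ":".
valid_chars() {
    case "$1" in
        *[!A-Za-z0-9.+~:-]*) return 1 ;;  # found a character outside the set
        *) return 0 ;;
    esac
}

for v in 1.0.5-1 1.0_src.tar.gz 3.0/; do
    if valid_chars "$v"; then
        echo "$v: ok"
    else
        echo "$v: invalid character"
    fi
done
```

Strings that fail this check fall outside the policy's version syntax, which is why dpkg warns about them below.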

Let's try to compare them using Debian's own tools:

First, define a tiny shell function to help compare strings:

    # Print the two version strings in dpkg's order, smaller one first.
    compver() {
       if dpkg --compare-versions "$1" lt "$2"; then
           printf "%s\n" "$1" "$2"
       else
           printf "%s\n" "$2" "$1"
       fi
    }

Then, compare the values:

  $ compver 1.0.5_src.tar.gz 1.0_src.tar.gz
  dpkg: warning: version '1.0.5_src.tar.gz' has bad syntax: invalid character in version number
  dpkg: warning: version '1.0_src.tar.gz' has bad syntax: invalid character in version number
  1.0.5_src.tar.gz
  1.0_src.tar.gz

  $ compver 3.0/ 3.0.5/
  dpkg: warning: version '3.0/' has bad syntax: invalid character in version number
  dpkg: warning: version '3.0.5/' has bad syntax: invalid character in version number
  3.0.5/
  3.0/

So sort's order agrees with Debian's ordering rules.
It might not be what a "natural sort" algorithm would do, but version-sort
is not exactly natural-sort.
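
To see the two orderings side by side, the sample input from the report
can be fed to both commands. This is only a sketch. It assumes GNU sort,
and the helper names (sample, version_order, numeric_order) are made up
for illustration. The numeric-key variant is a workaround that happens to
yield the reporter's expected order for this particular input (at most
three dot-separated numeric components); it is not a general natural sort.

```shell
# Hypothetical helper: print the eight sample lines from the report.
sample() {
    printf '%s\n' \
        1.0.5_src.tar.gz 1.0_src.tar.gz \
        2.0.5src.tar.gz 2.0src.tar.gz \
        3.0.5/ 3.0/ \
        4.0.5beta/ 4.0beta/
}

# Version sort, as in the report (LC_ALL=C avoids locale effects).
version_order() { sample | LC_ALL=C sort -V; }

# Assumed workaround: split on "." and compare the first three fields
# numerically; trailing text such as "_src" or "/" is ignored by -n.
numeric_order() { sample | LC_ALL=C sort -t. -k1,1n -k2,2n -k3,3n; }

version_order
echo ---
numeric_order
```

With -n, a field like "0_src" or "tar" is read as 0, so "1.0_src.tar.gz" compares as 1, 0, 0 and lands before "1.0.5_src.tar.gz" (1, 0, 5). Inputs with more components or with letters inside the numbers would need a different approach.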

Another detailed example of a version-sort is here:
https://debbugs.gnu.org/cgi/bugreport.cgi?bug=22275


As such, I'm closing this bug.
Discussion can continue by replying to this thread.

-assaf
