bug#20745: I would like to make a request for the sort command

From: Pádraig Brady
Subject: bug#20745: I would like to make a request for the sort command
Date: Mon, 08 Jun 2015 16:38:42 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.6.0

On 08/06/15 10:51, Stephane Chazelas wrote:
> 2015-06-08 11:16:37 +0200, Erik Auerswald:
> [...]
>> FWIW I use 'sort' to sort IPv4 addresses in my ping_scan[1] script.
>> The info documentation for sort provides another example, log files
>> sorted by IP address and time stamp. That specific example even needs
>> two runs of sort, because sort lacks built-in support for IP addresses.
>> While IPv4 addresses are readily sorted by "sort -s -t '.' -k 1,1n -k
>> 2,2n -k 3,3n -k 4,4n", this is not the case for IPv6 addresses. Having
>> an option for sorting IP addresses that supports both IPv4 and IPv6
>> seems like a useful addition to me.
> [...]
> In the spirit of tools doing one thing and doing it well, it
> would make more sense to have a tool that converts an IP address
> to something sortable and use that instead in combination with
> sort.
> I'm not even sure having a tool just for that specific task
> would make sense though. Here, it sounds more like a job for a
> high level language like perl/python... (what if I want to sort
> on roman numerals now, week day names, astrological signs...)
> for instance, here using yash syntax (you can use named pipes or
> possibly coprocs with some other shells):
> ip2hex() {
>   perl -MSocket=:all -nle '
>     print unpack "(H2)*", inet_pton(/:/?AF_INET6:AF_INET, $_)'
> }
> mysort() {
>   (
>     exec 3>>|4
>     tee /dev/fd/3 |
>       cut -f1 3>&- | ip2hex 3>&- |
>       paste - /dev/fd/4 3>&-
>   ) | sort | cut -f2-
> }
> mysort << EOF
>       blah
>         foo
> ::1             bar
> That's still quite awkward. A shame that piping capabilities in
> shells don't extend to more complex scenarios where the output
> of some command can be piped to two others the output of which
> can be merged back easily.
> named pipes can be used for that, but cleaning up and
> restricting access to them makes their usage quite messy.
> Of course, the whole thing can be done with perl.

This is a useful example.
We're essentially talking about generalizing the
Decorate Sort Undecorate pattern here, which can be broken down to
three steps: decorate, sort, undecorate.
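
As a rough illustration of that pattern for the IPv4 case mentioned
earlier in the thread, something like the following should work.
The function name dsu_sort_ipv4 is made up for this sketch, and it
assumes the address is the first whitespace-separated field on each line:

```shell
# Decorate-Sort-Undecorate sketch for IPv4 addresses (illustrative only).
dsu_sort_ipv4() {
  # Decorate: prepend a fixed-width key built from zero-padded octets.
  awk '{ split($1, o, "."); printf "%03d%03d%03d%03d\t%s\n", o[1], o[2], o[3], o[4], $0 }' |
    # Sort: a plain byte-wise sort now orders the addresses numerically.
    LC_ALL=C sort |
    # Undecorate: drop the key column that was added above.
    cut -f2-
}

printf '%s\n' '10.0.0.2 b' '9.1.1.1 a' '10.0.0.10 c' | dsu_sort_ipv4
```

Only the decorate step knows anything about IP addresses;
sort and cut are used unchanged.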


Your example above generalizes the Decorate and Undecorate
parts using shell constructs, and you had a further suggestion
for pulling those into sort itself, like:

  sort '-k1,1|ip2hex' '-k2,2n|roman2int' '-k3,3|iconv -t us//TRANSLIT'

Note that we have a similar kind of subprocess support in split, for example:

  seq 10 20 | split -nr/$(nproc) --filter='rev'

If doing this within sort(1), we'd have to read from as well as write to
the pipe, and for performance these filters would be used to process the
input before passing it to the standard sort consumer functions.
Now, having this within sort only provides conciseness rather
than a functional advantage.

Another alternative would be to generalise the Decorate portion
of the process, which would be simpler as it is write only, and also
inherently distributed across cores with a separate "decorate" process.
Doing this would also be of more general use than just for sort.
So we might have:

  decorate '-k1,1|ip2hex' '-k2,2n|roman2int' '-k3,3|iconv -t us//TRANSLIT'

Generally with "decorate", you would add a column rather than replacing,
though that would be controllable with options.
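
To make the "add a column" behaviour concrete, here is a minimal sketch
of the idea as a shell function. This is not an existing command: the
name decorate_col and its arguments are hypothetical, it handles a single
tab-separated field only, and it assumes the filter emits exactly one
output line per input line:

```shell
# decorate_col FIELD CMD [ARGS...]   (hypothetical helper, sketch only)
# Runs CMD over tab-separated field FIELD of stdin and prepends the
# result as a new first column, leaving the original line intact.
decorate_col() (
  field=$1; shift
  tmp=$(mktemp) || exit 1
  cat > "$tmp"                            # buffer input so it can be read twice
  cut -f"$field" "$tmp" | "$@" | paste - "$tmp"
  status=$?
  rm -f "$tmp"
  exit "$status"
)

# Decorate with an upper-cased key, sort on it, then drop the key again.
printf 'b\t1\na\t2\n' | decorate_col 1 tr a-z A-Z | sort | cut -f2-
```

Buffering to a temporary file sidesteps the tee-and-named-pipe plumbing
from the quoted example, at the cost of not streaming the input.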

You could also do this whole decorate processing with something like
http://www.pixelbeat.org/scripts/funcpy or your perl -nle method,
and that would also support correlated operations like filtering.
That would require users to know Python, Perl or whatever language
is used, though, so there is some merit, I think, to having something
like the above decorate command.

