Re: 'sort' command -- support more than one field-delimiter

From: Dragan Simic
Subject: Re: 'sort' command -- support more than one field-delimiter
Date: Mon, 16 Oct 2023 00:05:34 +0200

On 2023-10-16 00:01, Pádraig Brady wrote:
On 15/10/2023 14:28, Dani Moncayo wrote:
Consider a file like [1].
(Note that, in each line, the dates are in (month/day/year) format,
_without_ zero-padding)

I want to sort that file by date.

After looking at sort's documentation (and also searching a bit on the
internet), I fail to see a way to achieve that sorting, by just using
the "sort" command.

It seems that one can specify just _one_ character as field separator.
I was expecting to be able to specify more than one. Something like:
$ sort -n -t' /' -k5 -k3 -k4 file1.txt

I think it would give the flexibility I was looking for here.

What do you think?

Yes, it's a fair point, though a bit of an edge case,
i.e. a fixed number of fields separated by different delimiters.

The more general approach is to use the DSU pattern
(which is mentioned in the info manual), to adjust the data.
For your case you could do something like:

  sed 's/\(.*\) \([0-9/]\+\)$/\2 \1 \2/' < file1.txt |
  sort -t/ -n -k3,3 -k1,1 -k2,2 |
  cut -d' ' -f2-
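
To make the decorate-sort-undecorate steps concrete, here is a minimal sketch of the same pipeline wrapped in a function and run on made-up input. The sample lines and the function name sort_by_trailing_date are assumptions for illustration (the real file1.txt is not shown here), and the sed expression is spelled with [0-9/][0-9/]* instead of \+ only so that it does not rely on a GNU extension.

```shell
# Decorate-sort-undecorate, assuming every line ends in " M/D/YYYY".
# sort_by_trailing_date is a hypothetical helper name.
sort_by_trailing_date() {
    # Decorate: copy the trailing date to the front of the line.
    sed 's/\(.*\) \([0-9/][0-9/]*\)$/\2 \1 \2/' |
    # Sort: with "/" as separator, field 1 is the month, field 2 the day,
    # and field 3 starts with the year; -n compares only the leading number.
    LC_ALL=C sort -t/ -n -k3,3 -k1,1 -k2,2 |
    # Undecorate: drop the date that was copied to the front.
    cut -d' ' -f2-
}

# Hypothetical sample data; the dates are not zero-padded.
printf '%s\n' \
    'alpha one 10/5/2023' \
    'beta two 2/14/2023' \
    'gamma three 12/1/2022' \
    'delta four 2/3/2023' |
sort_by_trailing_date
```

With this input the 2022 line should come out first, followed by the 2023 lines in month and day order (2/3, 2/14, 10/5). The sketch only holds while the date is the last space-separated field on every line.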

Also if we were to implement this, then it might introduce confusion.
For example users might try to `sort -t "blah"` expecting
the word "blah" to be used as a delimiter.
This is similar to Python's str.rstrip() method,
which takes a set of characters rather than a suffix, but I've seen many
instances of code doing things like filename.rstrip(".txt"),
which is an insidious error as it often works.
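
The difference between "a set of characters" and "a literal string" can be sketched in shell as well. This is only an analogue of the rstrip() behaviour described above, not Python itself; the function names rstrip_chars and strip_suffix are hypothetical, and the character set is hard-coded to ".tx" to keep the example short.

```shell
# Character-set semantics, like rstrip(".txt"):
# drop every trailing character that is '.', 't' or 'x'.
rstrip_chars() {
    printf '%s\n' "$1" | sed 's/[.tx]*$//'
}

# Suffix semantics, which is usually what the caller meant:
# drop the literal string ".txt" once, if present.
strip_suffix() {
    printf '%s\n' "${1%.txt}"
}

rstrip_chars data.txt     # data    (looks correct)
strip_suffix data.txt     # data
rstrip_chars report.txt   # repor   (the final 't' of "report" is eaten too)
strip_suffix report.txt   # report
```

The first pair of calls agrees, which is why the mistake tends to go unnoticed until a name happens to end in one of the stripped characters.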

A very similar kind of confusion happens more often than you'd think with PHP's trim() function. [1]


reply via email to

[Prev in Thread] Current Thread [Next in Thread]