coreutils

Re: Extend uniq to support unsorted list based on hashtable


From: Bob Proulx
Subject: Re: Extend uniq to support unsorted list based on hashtable
Date: Mon, 1 Jun 2020 17:29:40 -0600

Yair Lenga wrote:
> For the first point, I would note that most coreutils goes well beyond
> POSIX. Consider "cp", which has many useful additions beyond the POSIX
> features.

Most of those additions exist because file systems gained new
features, such as ACLs and extended attributes, and cp needed to be
able to deal with them.  There was no other way to handle them.
(However, I don't think I have ever had reason to use the
--strip-trailing-slashes option.)

> The second point is about availability of other tools to achieve
> similar task. This is a judgement call about where this functionality
> belongs. There is no single right answer here. Such an implementation
> can be done with a few lines of code in any scripting solution

If a task can be done with a small combination of utilities then that
small combination of utilities is usually the right way to do it.
Otherwise, instead of a small set of utilities that work together, the
result is many very large utilities, each of which does everything.
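
As a rough illustration of such a small combination, here is a minimal
sketch.  The function names are made up for this example and are not
part of coreutils.  The first uses awk's associative array as the hash
table, so the input does not need to be sorted and first-seen order is
kept.  The second is the usual sort and uniq pairing for counting.

```shell
# Hypothetical helper names, for illustration only.

# Print each distinct line once, in first-seen order.  The awk array
# "seen" acts as the hash table, so no sort is required.
unsorted_uniq() {
    awk '!seen[$0]++' "$@"
}

# Count occurrences of each distinct line, most frequent first.
count_unique() {
    sort "$@" | uniq -c | sort -rn
}
```

For example, "printf '%s\n' b a b c a b | unsorted_uniq" prints b, a
and c, one per line.  The awk version holds every distinct line in
memory, while sort can spill to temporary files for very large inputs.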

> My main point is that given that the very common use case for 'uniq'
> is combined with other coreutils functions (sort, cut, sed), it makes
> sense to have an efficient implementation for "counting unique
> values" available within "coreutils", instead of sending the user to
> look for a solution elsewhere, or to implement his own.

Some years ago a programming challenge involving Donald Knuth and Doug
McIlroy became somewhat famous.  I highly recommend studying this
example.  Here is an article that discusses the event.

  http://www.leancrew.com/all-this/2011/12/more-shell-less-egg/

It's a very educational lesson that we might learn from those
programming greats!
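
For anyone who does not follow the link, the pipeline usually
attributed to McIlroy in that story prints the N most frequent words
in a text.  The sketch below follows the commonly quoted form.  The
word_freq wrapper name and the default of 10 are my own additions.

```shell
# Hypothetical wrapper around the commonly quoted pipeline.
# Usage: word_freq N < textfile   (N defaults to 10 here)
word_freq() {
    tr -cs A-Za-z '\n' |      # put each run of letters on its own line
        tr A-Z a-z |          # fold everything to lower case
        sort |                # bring identical words together
        uniq -c |             # count each distinct word
        sort -rn |            # most frequent first
        sed "${1:-10}q"       # stop after the first N lines
}
```

Each stage is a stock utility doing one job.  Note that this uniq is
fed sorted input, which is the combination under discussion.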

Bob


