coreutils
Re: coreutils feature requests?


From: Lance E Sloan
Subject: Re: coreutils feature requests?
Date: Wed, 19 Jul 2017 13:03:59 -0400

Hi, Eric.

Thank you for the thoughtful response.  I regret that I have trouble
understanding your point of view, though.  Please know that I do not mean
any disrespect.  I'd appreciate it if you could explain why you're opposed
to adding new features to cut (or to comm).

It may help if I explain my point of view.  I think that if the changes
I've described to cut and comm are considered useful by enough people, then
incorporating them into GNU coreutils is justified.  You're right, most
users won't see the new features overnight.  It takes time for the features
to reach their systems through updates, new OS installations, etc.
However, their delayed gratification doesn't mean we shouldn't attempt to
distribute these useful features.  If the lack of instantaneous utilization
were a true barrier, most software would never get (or never would have
gotten) any new features.

I agree that, for the moment, I may need to use another solution to get the
features I've suggested.  To accomplish the field reordering that cut
lacks, I'm trying a variety of other solutions.  I can use awk, as you
suggest.  I could also do it with bash, perl, python, php, sed, javascript,
java, C, etc.  Each solution has its benefits and drawbacks.
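
To make the requested behavior concrete, here is a minimal sketch of the difference between what cut does today and the proposed reordering. GNU cut writes the selected fields in the order they appear in the input, regardless of the order given in the field list. The function names `cut_fields` and `reorder_fields` are hypothetical and exist only for this illustration. The sketch shows the semantics only; it is not a proposed implementation, and it ignores cut's special handling of lines that contain no delimiter.

```python
def reorder_fields(line, order, delimiter="\t"):
    """Return the 1-based fields of `line` in exactly the order listed.

    Hypothetical helper: field numbers the line does not have are skipped,
    which is a simplification chosen for this sketch.
    """
    fields = line.split(delimiter)
    return delimiter.join(fields[i - 1] for i in order if 1 <= i <= len(fields))


def cut_fields(line, order, delimiter="\t"):
    """Approximate `cut -f`: selected fields come out in input order, each once."""
    return reorder_fields(line, sorted(set(order)), delimiter)


if __name__ == "__main__":
    row = "alice\t30\tdetroit"
    # Like `cut -f3,1`: the list order is ignored, so field 1 comes first.
    print(cut_fields(row, [3, 1]))
    # The requested feature: fields come out in the order listed (3, then 1).
    print(reorder_fields(row, [3, 1]))
```

With the sample row above, the first call yields fields 1 and 3 in input order, while the second yields field 3 followed by field 1.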

My considerations for a solution:

1.  I need this feature to process several files that have millions of
lines each.  I need to do this on an ongoing, periodic basis.  I can't
afford for the process to be slow.
2.  Since I have a large amount of data, I'm avoiding regular expressions
and interpreted languages, which take longer to complete the job.  That
eliminates awk and several other possible solutions.  A compiled C
application would be best.
3.  I don't need to write a special program for this purpose if one already
exists.  The cut utility already does most of the job.
4.  Part of my data processing uses jq.  I've figured out how to do this
field reordering with it, but it makes my jq filter more complex and more
difficult for my successors to maintain.  As written on
https://stedolan.github.io/jq/ , "jq is like sed for JSON data".  I don't
consider sed to be a good solution for a problem of this size, so jq
probably isn't ideal, either.

Since a C implementation should run the fastest and cut from GNU's
coreutils is written in C and presumably doesn't need much work to support
this, it seems like the best solution.

Even if this feature suggestion isn't approved by the GNU community, I will
implement it for my own use anyway.  I can enjoy the new functionality
(which I think should have been added to cut long ago) and keep it to
myself or I can contribute it back to the online community.  I could
distribute it as my own fork of GNU coreutils or as a patch to it.
However, if it were merged into GNU's coreutils, it would get the most
exposure and be helpful to more people.

While I'm glad that several people on this mailing list like my suggestion,
I still don't know much about the procedure for getting it approved for
inclusion in GNU's coreutils.  I'd really like to know what needs to be
done.

With regard to your objection to a special environment variable: I agree.
I didn't feel strongly about it at first, but I was leaning towards not
implementing environment variable support for this.  It just didn't feel
right.  I have written programs that use environment variables to specify
or override default options.  However, your point about adding this to an
existing, established program, where it could cause variable conflicts, is
enough to set my opinion a little more strongly against an environment
variable for this purpose.

On 19 July 2017 at 10:15, Eric Blake <address@hidden> wrote:

> On 07/19/2017 08:29 AM, Nellis, Kenneth wrote:
> > From: Steeve McCauley
> >> I can't believe I'd never thought of reordering output columns like
> >> this.
> >> FWIW, I agree that another option should be used to prevent issues with
> >> backward compatibility.
>
> Unfortunately, it takes time to add an option, and then for that
> addition to percolate into the pre-built binaries of all the distros
> that you use.  In the meantime, you can already use awk to get the
> behavior you want today.  And since awk is already portable and required
> by POSIX to be able to reorder output fields, it's that much more of a
> burden to justify adding a feature to cut (adding a feature is easy if
> it is easy to show that no other existing tools can fill the gap, but a
> one-line awk program doesn't feel like no other tools being able to fill
> the gap).  Adding a new feature to GNU Coreutils is also easier to do if
> you can find someone else (like BSD) that already has the feature - but
> to my knowledge, no common 'cut -o' exists in other major distributions.
>

