bug-coreutils


bug#14224: Feature request for the `cut`: record delimiter


From: Pádraig Brady
Subject: bug#14224: Feature request for the `cut`: record delimiter
Date: Wed, 17 Apr 2013 18:13:49 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130110 Thunderbird/17.0.2

On 04/17/2013 02:26 PM, George Brink wrote:
> Hello,
> 
> I have a task of extracting several "fields" from the text file. The
> standard `cut` tool could be a perfect tool for a job, but...
> In my file the '\n' character is a legal symbol inside fields and therefore
> the text file uses other symbol for record-separator. And the `cut` has a
> hard-coded '\n' for record separator (I just checked the source from the
> coreutils-8.21 package).

The patch would be simple, but not without a compatibility cost:
scripts using the new option would immediately become incompatible
with any system lacking the feature.

So you'd like something like tac's -s, --separator option.
However, cut -s is already taken (--only-delimited), so we'd have to
avoid the short -s at least. Also, tac -s takes a string rather than a
single character, which makes the option more useful there, but also
more complex.

Also related would be support for the -z, --zero-terminated option.
join, sort and uniq all have this option to use NUL as the record separator;
however, they're all closely related, sort-dependent utilities,
and we're trying to unify options between them.
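
For illustration, here is roughly how -z behaves in sort and uniq
today. This is only a sketch: the function name and the sample data
are made up for the example.

```shell
#!/bin/sh
# Hypothetical helper: sort NUL-terminated records and drop duplicates.
# With -z, both tools treat NUL rather than newline as the record separator.
sorted_unique_nul() {
  sort -z | uniq -z
}

# Three NUL-terminated records; tr makes the result readable on a terminal.
printf 'b\0a\0b\0' | sorted_unique_nul | tr '\0' '\n'
```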

If it is just a character you want to separate on,
then you can always use tr to convert before processing,
albeit with associated data copying overhead.

SEP=^
tr "$SEP"'\n' '\n'"$SEP" | cut ... | tr "$SEP"'\n' '\n'"$SEP"
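
Spelled out a little more, as a sketch only: the ':' field delimiter,
the function names and the sample input below are assumptions made for
the example, not something taken from the original request.

```shell
#!/bin/sh
# Assumed for illustration: records end with '^', fields are split on ':'.
SEP='^'

# Swap the record separator and newline so cut sees one record per line.
swap_sep() {
  tr "$SEP"'\n' '\n'"$SEP"
}

# Hypothetical wrapper: select field $1 from each '^'-terminated record,
# then swap back so embedded newlines and the separator are restored.
cut_records() {
  swap_sep | cut -d: -f"$1" | swap_sep
}

# The first record's second field contains an embedded newline ("b\nc").
printf 'a:b\nc:d^e:f^' | cut_records 2
```

Inside the pipeline cut sees the embedded newline as "^", so it passes
through untouched, and the final tr turns it back into a newline.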

So given that cut is not special here among the text filters,
and there is a workaround available, I'm 60:40 against
adding this feature.

thanks,
Pádraig.




