Re: csplit feature request: allow a user to specify which pieces to output
Wed, 12 Mar 2014 09:22:45 +0000
On 03/12/2014 04:26 AM, Hong Yang wrote:
> If a user is splitting a large file into pieces only to use several of them,
> it will be efficient to just specify the indexes of files to output.
> Take an extreme case for example. "a_large_file" has 12823371193 lines.
> "csplit a_large_file 823371193 823371293" will have three output files: xx00
> with 823371192 lines, xx01 with 100 lines, and xx02 with the rest. If a user
> is only interested in xx01, it will be desirable to do "csplit a_large_file
> 823371193 823371293 -o 1."
So this is a borderline one.
Functionally it is useful as it can avoid redundant
processing and storage.
split(1) has a similar feature: -n K/N writes only the Kth
of N chunks to standard output. It doesn't support -o N
to select an arbitrary chunk based on size, as that can be done
for single chunks with dd.
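
As a rough illustration of the two existing options mentioned above, the
sketch below pulls the second of four equal chunks from a tiny file, first
with split -n K/N and then with dd. The file name and the 4-byte chunk size
are assumptions chosen only to keep the example small.

```shell
# Work in a scratch directory so no existing files are touched.
workdir=$(mktemp -d)

# A 16-byte sample file (hypothetical name), i.e. four 4-byte chunks.
printf 'aaaabbbbccccdddd' > "$workdir/sample.txt"

# split -n K/N writes only the Kth of N byte-sized chunks to stdout.
chunk_split=$(split -n 2/4 "$workdir/sample.txt")

# The same byte range with dd: 4-byte blocks, skip the first, copy one.
chunk_dd=$(dd if="$workdir/sample.txt" bs=4 skip=1 count=1 2>/dev/null)

echo "$chunk_split"
echo "$chunk_dd"
```

Both commands print the same chunk here. Note that these select by byte
offset, not by line number or pattern as csplit does.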
If we were to implement this for csplit, we'd probably implement
it for split also, and support specifying multiple (disjoint) chunks.
However, given that it is only a performance feature and, I think, not
a common use case, I'm 60:40 against implementing it.
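
For line-number splits like the one in the request, a single piece can
already be produced without writing the others. The following is only a
sketch on a 20-line stand-in file (the file names and line numbers are
assumptions); it compares the middle piece from csplit with the same
range printed by sed.

```shell
# Work in a scratch directory so no existing xx* files are touched.
workdir=$(mktemp -d)
seq 1 20 > "$workdir/input.txt"

# csplit writes all three pieces; piece 01 holds lines 5 through 9,
# since each piece stops just before the next given line number.
csplit -s -f "$workdir/xx" "$workdir/input.txt" 5 10
piece_csplit=$(cat "$workdir/xx01")

# sed prints only lines 5-9 and quits at line 9, so the rest of the
# input is never read and no other pieces are stored.
piece_sed=$(sed -n '5,9p;9q' "$workdir/input.txt")

echo "$piece_sed"
```

Scaled up to the request's numbers, the sed range would be
823371193,823371292. This does not cover pattern-based splits or
selecting several pieces at once.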