From: Hong Yang
Subject: Re: csplit feature request: allow a user to specify which pieces to output
Date: Wed, 12 Mar 2014 10:17:42 -0500
On 03/12/2014 04:26 AM, Hong Yang wrote:
> If a user is splitting a large file into pieces only to use several of them, it will be efficient to just specify the indexes of files to output.
>
> Take an extreme case for example. "a_large_file" has 12823371193 lines. "csplit a_large_file 823371193 823371293" will have three output files: xx00 with 823371192 lines, xx01 with 100 lines, and xx02 with the rest. If a user is only interested in xx01, it will be desirable to do "csplit a_large_file 823371193 823371293 -o 1."

So this is a borderline one.
Functionally it is useful as it can avoid redundant
processing and storage.
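For comparison, here is a rough sketch of how a single piece can be
extracted today without writing the others. It uses a hypothetical
10-line "sample" file in place of a_large_file, and the variable names
are only for illustration. It relies on csplit ending each piece one
line before the next line number given.

```shell
# Work in a scratch directory so the xx* files don't clutter anything.
cd "$(mktemp -d)"
seq 1 10 > sample      # hypothetical 10-line stand-in for a_large_file

start=3                # first line of the wanted piece
next=6                 # first line of the following piece

# What csplit does today: writes xx00 (lines 1-2), xx01 (3-5), xx02 (6-10).
csplit -s sample "$start" "$next"

# Only the middle piece; sed quits as soon as its last line is printed,
# so the rest of the input is never read.
last=$((next - 1))
sed -n "${start},${last}p;${last}q" sample > piece

cmp xx01 piece && echo "piece matches xx01"
```

The sed command still has to read the lines before the wanted piece,
but it writes nothing for them and stops early.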
split(1) has a similar feature in that one can use -n K/N
to output only the Kth chunk out of N. It doesn't support an option
like -o N to select an arbitrary chunk based on size, as that can be
done for single chunks with dd.
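As a small illustration of those two existing options, assuming GNU
split and a hypothetical 16-byte "sample" file made of four 4-byte
chunks:

```shell
cd "$(mktemp -d)"
printf 'aaaabbbbccccdddd' > sample   # hypothetical input, four 4-byte chunks

# GNU split: write only chunk 2 of 4 to standard output (no files created).
kth=$(split -n 2/4 sample)

# dd: select the same chunk by size, skipping one 4-byte block and copying one.
by_dd=$(dd if=sample bs=4 skip=1 count=1 2>/dev/null)

echo "$kth $by_dd"
```

Both commands select the second 4-byte chunk here.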
If we were to implement it for csplit, we'd probably implement it
for split also, and support specifying multiple (disjoint) chunks.
However, given it is only a performance feature, and I think not
a common use case, I'm 60:40 against implementing it.
thanks,
Pádraig.