Re: split --filter=CMD: last call
From: Jim Meyering
Subject: Re: split --filter=CMD: last call
Date: Tue, 03 May 2011 11:43:09 +0200
Pádraig Brady wrote:
> On 03/05/11 09:54, Jim Meyering wrote:
>>
>> [PATCH 1/3] split: accept new output --filter=CMD option
>> [PATCH 2/3] tests: test split's new --filter=CMD option
>> [PATCH 3/3] doc: document split's new --filter=CMD option
>
> I should also play devil's advocate and point
> out the existing alternative to split --filter.
>
> # Set filter as required
> filter() { gzip -c < "$FILE" > "$FILE.gz"; }
> # create fifos
> rm -f x??
> split -n3 /dev/null
> for f in x??; do rm "$f" && mkfifo "$f"; done
> # consumer
> for FILE in x??; do filter & done
> # producer
> split -n3 file
> # cleanup
> wait
> rm -f x??
>
> The above is sufficiently onerous to make split --filter
> a useful option. Though I've a niggling feeling that the
> above could be the basis of a contrib helper script.
Onerous indeed ;-)
especially if the helper script must honor the PREFIX
and must be able to predict how many files the final
invocation of split will create. The latter is inherently
impossible in general, because the input may change between
the prediction and the final split (a race condition). I suppose
you could copy the input to a temporary file that you "know"
won't be changing, split it once to determine the names to use
for each $FILE/fifo, and then run split again to perform
the final split-to-fifo.
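
The copy-then-split-twice idea could be sketched roughly as follows.
This is only an illustration under stated assumptions, not a coreutils
script: the function name split_via_fifos, its argument order, and the
hard-coded gzip filter are all made up for the example, and error
handling is minimal.

```shell
#!/bin/sh
# Hypothetical helper, not part of coreutils.
# Usage: split_via_fifos PREFIX INPUT [SPLIT-OPTION]...
# The filter is hard-coded to gzip purely for illustration.
split_via_fifos() (
  prefix=$1 input=$2
  shift 2
  tmp=$(mktemp -d) || exit 1
  trap 'rm -rf "$tmp"' EXIT

  # Snapshot the input so that both split runs see identical data.
  cat -- "$input" > "$tmp/snapshot" || exit 1

  # First run: split into a scratch directory only to learn the suffixes.
  mkdir "$tmp/names" || exit 1
  split "$@" "$tmp/snapshot" "$tmp/names/p" || exit 1

  # Create one fifo per expected output name and start its consumer.
  pids=
  for f in "$tmp"/names/p*; do
    [ -e "$f" ] || continue          # empty input: nothing to do
    suffix=${f#"$tmp/names/p"}
    fifo=$prefix$suffix
    rm -f "$fifo" && mkfifo "$fifo" || exit 1
    gzip -c < "$fifo" > "$fifo.gz" &
    pids="$pids $!"
  done

  # Second run: the real split, writing into the fifos.
  status=0
  split "$@" "$tmp/snapshot" "$prefix" || {
    status=1
    kill $pids 2>/dev/null           # do not leave consumers blocked
  }
  wait

  # Remove the fifos; the filtered output stays in PREFIXaa.gz, ...
  for f in "$tmp"/names/p*; do
    [ -e "$f" ] || continue
    suffix=${f#"$tmp/names/p"}
    rm -f "$prefix$suffix"
  done
  exit "$status"
)
```

For example, "split_via_fifos out. big.log -l 1000" would be expected
to leave out.aa.gz, out.ab.gz, ... behind, at the cost of reading the
input three times (one copy plus two splits).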
One advantage of the helper script approach is that it makes it
easier to use the shell (or a tool like parallel) to parallelize
the filter processes.
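
As a rough sketch of what running the filters in parallel from the
shell might look like: when the number of chunks is fixed with -n,
the names can be learned by splitting /dev/null (as in the quoted
script), and round-robin distribution keeps every consumer busy at
once. The function name parallel_gzip_split and its arguments are
hypothetical, and gzip again stands in for an arbitrary filter.

```shell
#!/bin/sh
# Hypothetical sketch, not part of coreutils.
# Usage: parallel_gzip_split INPUT N PREFIX
parallel_gzip_split() (
  input=$1 n=$2 prefix=$3
  tmp=$(mktemp -d) || exit 1
  trap 'rm -rf "$tmp"' EXIT

  # With -n the number of outputs is fixed, so splitting /dev/null
  # in a scratch directory is enough to learn the suffixes.
  split -n "$n" /dev/null "$tmp/p" || exit 1

  # One fifo and one background gzip per chunk.
  pids=
  for f in "$tmp"/p*; do
    suffix=${f#"$tmp/p"}
    fifo=$prefix$suffix
    rm -f "$fifo" && mkfifo "$fifo" || exit 1
    gzip -c < "$fifo" > "$fifo.gz" &
    pids="$pids $!"
  done

  # r/N hands out whole lines round-robin, so each consumer receives
  # data while the input is still being read.
  status=0
  split -n "r/$n" "$input" "$prefix" || {
    status=1
    kill $pids 2>/dev/null           # do not leave consumers blocked
  }
  wait

  # Remove the fifos, keeping only the compressed chunks.
  for f in "$tmp"/p*; do
    suffix=${f#"$tmp/p"}
    rm -f "$prefix$suffix"
  done
  exit "$status"
)
```

Note that the round-robin mode interleaves lines across the outputs,
so it only suits filters that do not care about chunk contiguity.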