Re: Processing files from a tar archive in parallel
From: Ole Tange
Subject: Re: Processing files from a tar archive in parallel
Date: Tue, 29 Mar 2011 23:13:48 +0200
On Tue, Mar 29, 2011 at 10:14 PM, Jay Hacker <jayqhacker@gmail.com> wrote:
> On Tue, Mar 29, 2011 at 11:20 AM, Hans Schou <chlor@schou.dk> wrote:
>> On Tue, 29 Mar 2011, Jay Hacker wrote:
>>
>>> I have a large gzipped tar archive containing many small files; just
>>> untarring it takes a lot of time and space. I'd like to be able to process
>>> each file in the archive, ideally without untarring the whole thing first,
:
>> tar xvf big-file.tar.gz | parallel echo "Proc this file {}"
>>
>> Parallel will start when the first file is untarred.
:
> That is a great idea. However, can I be sure the file is completely
> written to disk before tar prints the filename?
While I loved Hans' idea, it does indeed have a race condition: tar
can print a filename before the file exists on disk, so parallel may
start the job too early. This should run 'ls -l' on each file after
unpacking, but it clearly fails now and then:
$ tar xvf ../i.tgz | parallel ls -l > ls-l
ls: cannot access 1792: No such file or directory
ls: cannot access 209: No such file or directory
ls: cannot access 21: No such file or directory
ls: cannot access 2256: No such file or directory
ls: cannot access 2349: No such file or directory
ls: cannot access 2363: No such file or directory
ls: cannot access 246: No such file or directory
ls: cannot access 2712: No such file or directory
But you could unpack in a new dir and use:
http://www.gnu.org/software/parallel/man.html#example__gnu_parallel_as_dir_processor
That seems to work.
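
Another way around the race, as a rough sketch only: hold each
filename back until tar has printed the next one. This rests on an
assumption that is not verified in this message, namely that tar
finishes writing one member before it prints the name of the next.
The helper name delay_one_line is made up for illustration; it is not
part of tar or GNU Parallel.

```shell
# Hypothetical helper (not from the original message): pass stdin through
# unchanged, but hold each line back until the next one has arrived.
delay_one_line() {
    have_prev=0
    prev=
    # '|| [ -n "$line" ]' keeps a final line that lacks a trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        if [ "$have_prev" -eq 1 ]; then
            printf '%s\n' "$prev"
        fi
        prev=$line
        have_prev=1
    done
    # End of input: tar has exited, so the last name can be released.
    if [ "$have_prev" -eq 1 ]; then
        printf '%s\n' "$prev"
    fi
}

# Intended use (assumes tar and GNU Parallel are installed; ASSUMPTION:
# a member is complete once tar prints the following member's name):
#   tar xvf ../i.tgz | delay_one_line | parallel ls -l > ls-l
```

The cost is that the last file is only handed to parallel once tar
exits, and directory entries printed by tar are passed along like any
other name.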
/Ole