Re: [coreutils] cp --parents parallel


From: Rob Gom
Subject: Re: [coreutils] cp --parents parallel
Date: Tue, 19 Oct 2010 08:38:54 +0200

On Tue, Oct 19, 2010 at 12:45 AM, Bob Proulx <address@hidden> wrote:

Hello again,

> Rob Gom wrote:
>> Regarding my example - I am aware of bashism. That was only an example
>> and it was more convenient for me to use bash features.
>
> But you didn't use any bash features that I noticed.

No, I didn't. I am just more familiar with bash syntax, even where it
is not POSIX. That is all there is to it.

>
>> As for the real case - I use it inside makefiles. I copy many
>> directory structures into single root.
>
> And the last one copied wins?  Okay.
>
> I am inclined to suggest using 'rsync' if --update is important.  The
> way that rsync updates the target is different from the way that cp
> updates the target.  The cp command writes it in place.  The rsync
> command uses a temporary file and a rename to ensure that the target
> is never available in a half written state.  Also I am not convinced
> that cp is completely race-condition safe when multiple cp processes
> are writing to the file at the same time.  Could it get duplicated
> data in the file?  I would need to look.

Well, that's not the case here. Make is invoked and spawns several
sub-makes. Every sub-make runs a target that copies files (cp --parents
--update).
Every sub-make has its own file list, so only one cp handles any given
file. But the lists share parts of the directory structure, exactly as
in my example.
--update is needed because make gets rerun; in that case I want cp to
copy only newer files (an optimisation).

>
>> If --parents itself works as expected, that will be the easiest
>> solution.
>
> I think it would be reasonable to make --parents work the same as it
> does for 'mkdir --parents' and not complain if the internal mkdir
> fails because the directory already exists.  But even if that is
> patched it will be years before the release containing it flows down
> to most installations where you can count on having it.  Your 7.4 copy
> was released on 2009-05-07 and yet you still have it in place and
> probably will for a while I would imagine.  Current is version 8.6
> released 2010-10-15.  Therefore it would still be wise to avoid the
> race and use techniques that don't exhibit the problem.  The example
> code I posted showed one such way.
>
>> Generally it looks like:
>> target:
>>     cp --parents --update $(FILES_LIST) $(TARGET)
>
> You could easily call 'mkdir -p $(TARGET)' before calling cp, remove
> the --parents, and avoid the problem.

I can't just use mkdir -p $(TARGET), because FILES_LIST is a list of
files with relative paths.
Example:
TARGET=/tmp/a
FILES_LIST=b/c/d/e.txt b/d/f.txt b/g.txt c/h.txt
In that case I would need a rule that creates not only /tmp/a, but also
/tmp/a/b/c/d, /tmp/a/b/d and /tmp/a/c. That makes things a bit more
complicated.

Yes, I could handle that, but then I would have to rely on mkdir -p
instead of relying on cp --parents (multiple processes could invoke
mkdir -p with a common directory root).
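
A minimal sketch of the mkdir-based approach, using the TARGET and
FILES_LIST values from the example above. Everything else is an
assumption made for illustration: the scratch directory, the sample
source tree and the variable names are hypothetical, and --parents and
--update are GNU cp options. The idea is that mkdir -p does not complain
about directories that already exist, so pre-creating them should leave
cp --parents with nothing to create.

```shell
#!/bin/sh
# Sketch only: pre-create destination directories with mkdir -p, then copy.
# All paths live under a scratch directory (hypothetical, for illustration).
set -eu

work=$(mktemp -d)
src="$work/src"
target="$work/a"                                     # stands in for TARGET=/tmp/a
files_list="b/c/d/e.txt b/d/f.txt b/g.txt c/h.txt"   # FILES_LIST from the example

# Build a sample source tree so the sketch can be run on its own.
for f in $files_list; do
    mkdir -p "$src/$(dirname "$f")"
    echo "content of $f" > "$src/$f"
done

cd "$src"

# Create every destination directory up front. mkdir -p succeeds when the
# directory already exists, for example because another process created it.
for f in $files_list; do
    mkdir -p "$target/$(dirname "$f")"
done

# The directories are now in place, so cp --parents should find them
# already present. --update skips files that are not newer than the copy.
cp --parents --update $files_list "$target"
```

In a GNU makefile the directory list could probably be derived with the
built-in functions instead of a loop, for example
mkdir -p $(addprefix $(TARGET)/,$(sort $(dir $(FILES_LIST)))), but that
line is untested here.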

I wrote the first email in the hope that I had missed something obvious,
which doesn't seem to be the case. I will rewrite the rules to use mkdir
instead. Thank you for your opinions.

Regards,
Robert


