coreutils

Re: Pair-wise file operation (copy, link)


From: Pádraig Brady
Subject: Re: Pair-wise file operation (copy, link)
Date: Sun, 25 Aug 2024 15:25:57 +0100
User-agent: Mozilla Thunderbird Beta

The reduced scope does help, thanks.
Note we already support --files0-from in wc, sort, du.
Similarly we might support --pairs0-from in ln at least.
Pairs would not be generally distributable with xargs etc. anyway,
as it might split a pair over invocations, so restricting to an option
seems best.
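
For illustration only, the NUL-separated pair format that a --pairs0-from
style option implies could be parsed as in the Python sketch below. Neither
the option nor the helper name parse_pairs0 exists in coreutils; both are
assumptions. The sketch rejects an odd number of names, which is the failure
that splitting a pair across invocations would cause.

```python
def parse_pairs0(data: bytes) -> list[tuple[bytes, bytes]]:
    """Split NUL-separated names into (source, destination) pairs.

    Hypothetical helper: it mimics how a --pairs0-from style input
    might be read; it is not an existing coreutils interface.
    """
    names = data.split(b"\0")
    # A trailing NUL terminates the last name, so drop the empty tail.
    if names and names[-1] == b"":
        names.pop()
    if len(names) % 2 != 0:
        raise ValueError("odd number of names: last source has no destination")
    # Pair item 0 with item 1, item 2 with item 3, and so on.
    return list(zip(names[0::2], names[1::2]))
```

Names are kept as bytes because file names are not guaranteed to be valid
text in any particular encoding.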

cheers,
Pádraig.

On 25/08/2024 14:43, Yair Lenga wrote:
Hi Padraig,

After thinking more about my use case, I can narrow it further to linking. If
the 'ln' command supports pair-wise linking, it's relatively trivial to
implement the cp from that point:
mkdir tree
ln --pairs src1/file1 tree/dest1/name1 src2/file2 tree/dest2/name2 src3/file3 tree/dest3/name3 ...
cp -r tree final-destination
rm -rf tree

Even better, the '--pairs' mode can be set to take input from stdin, which will 
remove limits on the number of files.
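
A stdin-driven mode like this can be approximated today with a short script.
The Python sketch below only illustrates the idea and is not the proposed
'ln' behaviour: the function name link_pairs and the tab-separated
"source<TAB>destination" line format are assumptions, and names containing
tabs or newlines are not handled.

```python
import os
from typing import TextIO


def link_pairs(stream: TextIO) -> int:
    """Create one symlink per 'source<TAB>destination' line; return the count."""
    count = 0
    for line in stream:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        src, sep, dst = line.partition("\t")
        if not sep or not src or not dst:
            raise ValueError(f"expected source<TAB>destination, got {line!r}")
        # Create the destination directory if needed, as 'mkdir -p' would.
        parent = os.path.dirname(dst)
        if parent:
            os.makedirs(parent, exist_ok=True)
        # Like 'ln -s src dst': the link text is stored exactly as given.
        os.symlink(src, dst)
        count += 1
    return count
```

Calling link_pairs(sys.stdin) after importing sys would process a mapping
piped in from another command, all within a single process.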

Hope the reduced scope will make it more likely to get approved.

Yair

On Sun, Aug 25, 2024 at 9:18 AM Yair Lenga <yair.lenga@gmail.com> wrote:

    Hi. Thanks for looking into my request.

    In my case, I have to bulk-move about 2500 files. This is part of a 
recurring sync job that has to mirror an existing hierarchy into a new 
hierarchy with different naming rules.

    It takes no time to create the mapping (even in a bash script with a case
statement). When I "pipe" the mapping into "ln" (with xargs), it takes >2 min
to create the symlinks. Practically all the time is spent on launching "ln".
With a custom perl script, it's 3 seconds.

    While technically a performance issue, the 2 minutes is beyond my SLA (1 
minute to do the sync, including ). With the xargs approach, the solution does 
not scale well.

    I believe this function will have broader use cases: I researched 
Stack Overflow before posting, and could not find any good Linux solution 
(other than custom scripts). I believe the effort is relatively small.

    Yair

    On Sun, Aug 25, 2024 at 8:22 AM Pádraig Brady <P@draigbrady.com> wrote:

        On 25/08/2024 12:39, Yair Lenga wrote:
         > Greetings!
         >
         > The 'cp' and 'ln' commands provide the ability to perform 'bulk'
         > operations by specifying multiple source files and a target
         > destination. In addition to convenience, this approach provides
         > significant performance benefits compared with running multiple
         > cp/ln commands, one for each file.
         >
         > One limit of the bulk approach is that the destination file name
         > must match the source file name. There are multiple cases where
         > this assumption is not true. Common examples include copying
         > source files into a new directory layout, or renaming files in
         > ways that cannot be expressed by rules.
         >
         > One common solution is to use a scripting engine (Python, Perl,
         > Node), which can perform the operation without spawning a process
         > per file. This is simple for basic operations (especially bulk
         > renames), but those tools do not have the power/flexibility of cp.
         >
         > My suggestion: add a '--pairs' option to those tools, which will
         > allow the tool to work on pairs of arguments within the same
         > invocation.
         >
         > cp --pairs ... -- source1 dest1 source2 dest2 source3 dest3
         > mv --pairs ... -- source1 dest1 source2 dest2 source3 dest3
         >
         > I hope that the development team will consider adding this feature
         > to a future release of cp/mv.

        An interesting proposal (for cp, ln, mv, install).
        It is a bit of an edge case though, so I'm inclined to think
        a performance-only interface change is not worth it here?

        thank you,
        Pádraig




