coreutils

Re: Pair-wise file operation (copy, link)


From: Yair Lenga
Subject: Re: Pair-wise file operation (copy, link)
Date: Sun, 25 Aug 2024 09:43:04 -0400

Hi Padraig,

After thinking more about my use case, I can narrow it further to linking.
If the 'ln' command supports pair-wise linking, it is relatively trivial
to implement the cp from that point:
mkdir tree
ln --pairs src1/file1 tree/dest1/name1 src2/file2 tree/dest2/name2 \
   src3/file3 tree/dest3/name3 ...
cp -r tree final-destination
rm -rf tree

Even better, the '--pairs' mode can be set to take input from stdin, which
will remove limits on the number of files.
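
To illustrate the stdin idea, here is a minimal sketch of a stand-alone
script that does the pair-wise linking in a single process. It is not a
coreutils feature. The function name link_pairs and the input format (one
"source<TAB>dest" pair per line) are assumptions made for this example only.

```python
import os
import sys


def link_pairs(lines, symbolic=True):
    """Create one link per 'source<TAB>dest' line; return the number created.

    The tab-separated format is an assumption of this sketch, so paths
    containing tabs or newlines are not supported.
    """
    make_link = os.symlink if symbolic else os.link
    count = 0
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        src, dest = line.split("\t", 1)
        parent = os.path.dirname(dest)
        if parent:
            # Create the destination directory if it does not exist yet.
            os.makedirs(parent, exist_ok=True)
        # For symlinks, src is stored verbatim as the link target, so a
        # relative src is resolved relative to the directory holding dest.
        make_link(src, dest)
        count += 1
    return count


if __name__ == "__main__":
    # All pairs are handled in this one process, with no 'ln' launched per file.
    link_pairs(sys.stdin)
```

Saved under a hypothetical name such as link_pairs.py, it could be fed the
mapping with something like "generate-mapping | python3 link_pairs.py".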

Hope the reduced scope will make it more likely to get approved.

Yair

On Sun, Aug 25, 2024 at 9:18 AM Yair Lenga <yair.lenga@gmail.com> wrote:

> Hi. Thanks for looking into my request.
>
> In my case, I have to bulk-move about 2500 files. This is part of a
> recurring sync job that has to mirror an existing hierarchy into a new
> hierarchy with different naming rules.
>
> It takes no time to create the mapping (even in a bash script with a case
> statement). When I "pipe" the mapping into "ln" (with xargs) it takes >2
> min to create the symlinks. Practically all of the time is spent launching
> "ln". With a custom perl script, it takes 3 seconds.
>
> While technically a performance issue, the 2 minutes are beyond my SLA (1
> minute to do the sync, including ). With the xargs approach, the solution
> does not scale well.
>
> I believe this function will have broader use cases: I researched
> Stack-Overflow before posting, and could not find any good Linux solution
> (other than custom scripts). I believe the effort is relatively small.
>
> Yair
>
> On Sun, Aug 25, 2024 at 8:22 AM Pádraig Brady <P@draigbrady.com> wrote:
>
>> On 25/08/2024 12:39, Yair Lenga wrote:
>> > Greetings!
>> >
>> > The 'cp' and 'ln' commands provide the ability to perform 'bulk'
>> > operations by specifying multiple source files and a target destination.
>> > In addition to convenience, this approach provides significant
>> > performance benefits compared with running multiple cp/ln commands, one
>> > for each file.
>> >
>> > One limit of the bulk approach is that the destination file name must
>> > match the source file name. There are multiple cases where this
>> > assumption is not true. Common examples include copying source files
>> > into a new directory layout, or renaming files in ways that cannot be
>> > expressed as rules.
>> >
>> > One common solution is to use a scripting engine (Python, Perl, Node),
>> > which can perform the operation without spawning a process per file.
>> > This is simple for basic operations (especially bulk renames), but those
>> > tools do not have the power/flexibility of cp.
>> >
>> > My suggestion: add a '--pairs' option to those tools, which will allow
>> > the tool to work on pairs of arguments within the same invocation.
>> >
>> > cp --pairs ... -- source1 dest1 source2 dest2 source3 dest3
>> > mv --pairs ... -- source1 dest1 source2 dest2 source3 dest3
>> >
>> > I hope that the development team will consider adding this feature in a
>> > future release of cp/mv.
>>
>> An interesting proposal (for cp,ln,mv,install).
>> It is a bit of an edge case though, so I'm inclined to think
>> a performance-only interface change is not worth it here?
>>
>> thank you,
>> Pádraig
>>
>

