coreutils

Re: Pair-wise file operation (copy, link)


From: Yair Lenga
Subject: Re: Pair-wise file operation (copy, link)
Date: Sun, 25 Aug 2024 09:18:03 -0400

Hi. Thanks for looking into my request.

In my case, I have to bulk-move about 2500 files. This is part of a
recurring sync job that has to mirror an existing hierarchy into a new
hierarchy with different naming rules.

Creating the mapping takes no time (even in a bash script with a case
statement). When I pipe the mapping into "ln" (with xargs), it takes more
than 2 minutes to create the symlinks. Practically all of that time is spent
launching "ln". With a custom Perl script, it takes 3 seconds.

While this is technically only a performance issue, 2 minutes is beyond my
SLA (1 minute to do the sync, including ). The xargs approach does not
scale well.
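
For illustration, here is a minimal sketch of the single-process approach, written in Python rather than Perl. It is not the script mentioned above: the function name link_pairs and the tab-separated "target<TAB>link name" input format are assumptions made for this example. The point is only that one process creates every link, so there is no per-file process launch.

```python
"""Create symlinks for many (target, link name) pairs in a single process."""
import os


def link_pairs(lines):
    """Create one symlink per 'target<TAB>link_name' line; return the count.

    The tab-separated input format is an assumption made for this sketch.
    """
    count = 0
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        target, link_name = line.split("\t", 1)
        parent = os.path.dirname(link_name)
        if parent:
            # Build the new hierarchy on demand.
            os.makedirs(parent, exist_ok=True)
        os.symlink(target, link_name)
        count += 1
    return count
```

Calling link_pairs(sys.stdin) would let the existing mapping generator be piped straight into it, in place of the xargs pipeline.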

I believe this feature would have broader use cases: I researched
Stack Overflow before posting and could not find any good Linux solution
(other than custom scripts). I believe the effort is relatively small.

Yair

On Sun, Aug 25, 2024 at 8:22 AM Pádraig Brady <P@draigbrady.com> wrote:

> On 25/08/2024 12:39, Yair Lenga wrote:
> > Greetings!,
> >
> > The 'cp' and 'ln' commands provide the ability to perform 'bulk'
> > operations by specifying multiple source files and a target destination.
> > In addition to convenience, this approach provides significant
> > performance benefits compared with running multiple cp/ln commands, one
> > for each file.
> >
> > One limit of the bulk approach is that the destination file name must
> > match the source file name. There are multiple cases where this
> > assumption is not true. Common examples include copying source files
> > into a new directory layout, or renaming files in ways that cannot be
> > expressed by rules.
> >
> > One common solution is to use a scripting engine (Python, Perl, Node),
> > which can perform the operation without spawning a process per file.
> > This is simple for basic operations (especially bulk renames), but those
> > tools do not have the power/flexibility of cp.
> >
> > My suggestion: Add '--pairs' option to those tools, which will allow the
> > tool to work on pairs of arguments, within the same invocation.
> >
> > cp --pairs ... -- source1 dest1 source2 dest2 source3 dest3
> > mv --pairs ... -- source1 dest1 source2 dest2 source3 dest3
> >
> > I hope the development team will consider adding this feature in a
> > future release of cp/mv.
>
> An interesting proposal (for cp,ln,mv,install).
> It is a bit of an edge case though, so I'm inclined to think
> a performance only interface change is not worth it here?
>
> thank you,
> Pádraig
>

