Re: Pair-wise file operation (copy, link)
From: Carl Edquist
Subject: Re: Pair-wise file operation (copy, link)
Date: Mon, 26 Aug 2024 06:28:33 -0500 (CDT)
On Sun, 25 Aug 2024, Yair Lenga wrote:
> In my case, I have to bulk-move about 2500 files. This is part of a
> recurring sync job that has to mirror an existing hierarchy into a new
> hierarchy with different naming rules.
>
> It takes no time to create the mapping (even in bash script, case
> statement). When I "pipe" the mapping into "ln" (with xargs) it
> takes >2 min to create the symlinks. Practically, all the time is
> spent on launching "ln". With a custom perl script - it's 3 seconds.
The fork+exec time definitely adds up, although 2 min for 2500 instances
seems pretty slow for Linux, even on old hardware.
Since the actual fs operations are apparently negligible in comparison
(only 3 sec with your perl script), I'd be curious whether you get a
similar 2+ min time for running /bin/true 2500 times:
$ time echo {1..2500} | xargs -n1 /bin/true
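To separate launch overhead from everything else, a rough sketch of that check could run the same 2500 arguments twice: once with one process per argument, once batched into as few processes as xargs allows. This assumes bash (for the "time" keyword) and that seq and /bin/true are available; N is just the file count from your message.

```shell
# Rough sketch: how much of the time is pure process launch?
# N matches the ~2500 files mentioned in this thread.
N=2500

# One fork+exec of /bin/true per argument (same shape as one "ln" per pair).
time sh -c "seq $N | xargs -n1 /bin/true"

# The same arguments batched into as few /bin/true invocations as possible.
time sh -c "seq $N | xargs /bin/true"
```

If the first timing is also in the minutes range, the cost is in process creation on that machine rather than in ln itself.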
> I believe this function will have broader use cases:
Yeah, the pairs mode sounds generally useful (even if it's only really a
performance optimization) ... I can relate to occasionally writing one-off
perl scripts for similar activities, because running ln for a long list of
files one at a time is annoyingly slow.
> I researched Stack-Overflow before posting, and could not find any good
> Linux solution (other than custom scripts).
Depending on how the new file paths/names are structured, you might be
able to get away with a combination of 'cp -al src/ dst/' (to copy the dir
hierarchy and create all the hard links in a single call to cp) followed
by calling rename(1) as needed (possibly under find/xargs), if the name
mapping is simple enough to express with text replacements.
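A minimal sketch of the cp step, assuming GNU cp and a source and destination on the same filesystem (hard links cannot cross filesystems). The directory and file names are made up for illustration, and the rename step is only shown in comments because the two common rename(1) implementations take different arguments.

```shell
# Sketch only: src/ and dst/ are throwaway directories under a temp dir.
work=$(mktemp -d)
mkdir -p "$work/src/sub"
echo data > "$work/src/sub/report.txt"

# -a preserves the hierarchy and attributes; -l makes hard links instead of
# copying file data, so the whole tree is mirrored by a single cp process.
cp -al "$work/src" "$work/dst"

# Both names should now show the same inode number.
ls -li "$work/src/sub/report.txt" "$work/dst/sub/report.txt"

# Renaming then depends on which rename(1) is installed, for example:
#   util-linux:  rename .txt .dat "$work"/dst/sub/*.txt
#   Perl rename: rename 's/\.txt$/.dat/' "$work"/dst/sub/*.txt
```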
... Alternatively, a more general purpose solution is to compile and load
("enable -f") the ln loadable builtin that comes with the bash sources.
Then calls to 'ln' from bash will use a shell builtin rather than
fork+exec'ing the external utility.
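Roughly, that could look like the sketch below. The path /usr/lib/bash/ln is an assumption (the location depends on how bash and its loadables were built or packaged), and the mapping and file names are hypothetical. If the builtin can't be loaded, the loop still works, it just falls back to the external ln.

```shell
# Assumption: the loadable lives at /usr/lib/bash/ln; adjust to wherever
# your build or distro package installed it.
if enable -f /usr/lib/bash/ln ln 2>/dev/null; then
    echo "using builtin ln"
else
    echo "builtin not available, falling back to external ln"
fi

work=$(mktemp -d)
touch "$work/a.src" "$work/b.src"

# Hypothetical mapping: one "target link-name" pair per line.
while read -r target linkname; do
    ln -s "$target" "$work/$linkname"
done <<EOF
a.src first.lnk
b.src second.lnk
EOF
```

With the builtin loaded, each ln in the loop runs inside the shell process, so the per-pair cost is just the symlink call.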
(Of course, I'd happily use the new pairs mode if added to cp/mv/ln.)
Carl