bug#8200: cp -lr uses a lot of CPU time.
From: Rogier Wolff
Subject: bug#8200: cp -lr uses a lot of CPU time.
Date: Tue, 8 Mar 2011 17:35:22 +0100
User-agent: Mutt/1.5.13 (2006-08-11)
Hi Jim & others,
Aaargh... It seems the bug has been fixed... Feel free to ignore my
explanation below.
On Tue, Mar 08, 2011 at 04:05:04PM +0100, Jim Meyering wrote:
> For starters, what version of cp did you use?
> Run cp --version
-> cp (GNU coreutils) 8.5
> > Top reports:
> >
> > PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> > 26721 root 20 0 2456 720 468 R 58.0 0.1 65:32.60 cp
> > 2855 root 20 0 2560 936 624 R 40.8 0.1 30:30.52 cp
> >
> > and I doubt they are halfway through.
> >
> > I wrote an application I call "cplr" which does the obvious in the
> > obvious manner, and it doesn't have this problem.
> >
> > I've run "strace", and determined that it is not doing system calls
> > that take much CPU time. Most system calls return in microseconds.
>
> Please give us a sample listing of the syscalls that strace
> shows you when you trace one of those long-running cp commands.
> A few hundred lines worth would be good.
I ran:

  time strace -tttTp 11453 |& head -1000 | awk '{print ($1-t)*1000, $0; t=$1;}'

to get the first 1000 of the process's system calls.
Previously I had omitted the "*1000", which made the output harder to
read, and so I hadn't noticed that the mkdir calls were the ones taking
a long time.
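Annotated, the filter just prepends to each strace line the gap since
the previous syscall, in milliseconds ($1 is strace's absolute -ttt
timestamp, in seconds):

  time strace -tttTp 11453 |& head -1000 |
      awk '{ print ($1 - t) * 1000, $0   # ms elapsed since previous call
             t = $1                      # remember this call's timestamp
           }'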
I started my own "cplr" program once without any arguments, and it
said:
=> Usage: cplr srcdir dstdir
=> Copy srcdir to dstdir by making hardlinks
=> (Like cp -lR, but without consuming lots of memory)
So apparently the problem we ran into back when I wrote it was that cp
was consuming far too much memory. That has apparently been fixed in
the meantime.
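For reference, cplr's approach is the obvious one: walk the source tree
once, mkdir the directories and hard-link everything else, so nothing
needs to be remembered per inode. A minimal sketch in shell (the real
cplr is a C program; this hypothetical version assumes file names
without embedded newlines or leading/trailing whitespace, and glosses
over symlink handling):

  #!/bin/sh
  # Sketch of "cp -lR src dst" that keeps no per-inode state in memory:
  # recreate the directory tree, then hard-link every non-directory.
  src=$1; dst=$2
  (cd "$src" && find . -type d) |
  while read -r d; do
      mkdir -p "$dst/$d"
  done
  (cd "$src" && find . ! -type d) |
  while read -r f; do
      ln "$src/$f" "$dst/$f"
  done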
Here is a typical section of the strace output. It is from my own cplr
program, as the "cp" output has scrolled off my screen and I've stopped
the cp -lr now that the problem turned out to be fixed.
0.0741482 1299598743.435264 link("current/linux-2.6.0-test2-clean/fs/file_table.c", "test2/linux-2.6.0-test2-clean/fs/file_table.c") = 0 <0.000047>
0.133991 1299598743.435398 link("current/linux-2.6.0-test2-clean/fs/read_write.c", "test2/linux-2.6.0-test2-clean/fs/read_write.c") = 0 <0.000036>
0.122786 1299598743.435521 link("current/linux-2.6.0-test2-clean/fs/xattr_acl.c", "test2/linux-2.6.0-test2-clean/fs/xattr_acl.c") = 0 <0.000041>
0.13113 1299598743.435652 link("current/linux-2.6.0-test2-clean/fs/jffs2", "test2/linux-2.6.0-test2-clean/fs/jffs2") = -1 EPERM (Operation not permitted) <0.000046>
0.740051 1299598743.436392 lstat64("current/linux-2.6.0-test2-clean/fs/jffs2", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0 <0.000015>
0.119925 1299598743.436512 open("current/linux-2.6.0-test2-clean/fs/jffs2/", O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY|O_CLOEXEC) = 6 <0.000022>
0.0889301 1299598743.436601 mkdir("test2/linux-2.6.0-test2-clean/fs/jffs2/", 0777) = 0 <0.031938>
32.057 1299598743.468658 getdents(6, /* 36 entries */, 32768) = 776 <0.000317>
> What type of file system are you using, and is it nearly full?
> Run this from e.g, the source directory: df -hT .
Filesystem Type Size Used Avail Use% Mounted on
/dev/md3 ext3 2.7T 2.4T 190G 93% /backup
> Ideally, you'd attach to one of those processes with gdb and step
> through the code enough to tell us where it's spending its time,
> presumably in coreutils-8.10/src/copy.c. Just running "gdb -p
> 26721" (where 26721 is the PID of one of your running cp processes)
> and typing "backtrace" at the prompt may give us a good clue.
It's spending its time in mkdir; that is visible in the strace output
above: the mkdir returned in 0.031938 s, several hundred times slower
than the link() calls around it, which return in tens of microseconds.
For a sample directory, download the Linux kernel source and unpack it
some 300 times into different subdirectories, renaming each unpacked
tree to linux-1, linux-2, etc.; a sketch follows below.
(My count comes to 325 copies, but many are 2.4 kernels, so a lot
smaller than current ones.)
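Something along these lines would build such a tree (the tarball name
here is just an example):

  # Build a test tree of ~300 copies of the kernel source.
  for i in $(seq 1 300); do
      tar xjf linux-2.6.0-test2.tar.bz2   # example tarball name
      mv linux-2.6.0-test2 linux-$i
  done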
> Next best, you would give us access to your system or a copy of your
> hierarchy.
> But we don't even ask for that, because that's rarely feasible.
> Next best: you would give us the output of these two commands:
> [if you can do this, please respond privately, not to the list]
>
> find YOUR_SRC_DIR -ls | xz -e > src.find.xz
> find YOUR_DST_DIR -ls | xz -e > dst.find.xz
>
> [if you don't have xz, install it or use bzip2 -9 instead of "xz -e";
> xz is better]
>
> With that, we'd get an idea of hard link counts and max/average
> number of entries per directory, name length, etc.
>
> However, most people don't want to share file names like that.
> If you can, please put those two compressed files somewhere like
> an upload site and reply with links to them.
> Otherwise, please give us some statistics describing your
> two hierarchies by running these commands:
>
> These give counts of files and directories for each of your source
> and destination directories:
The dest dir is created by the cp -lr, so it starts out empty and ends
up with the same counts as the source dir. :-)
> find YOUR_SRC_DIR -type f |wc -l
About 4.7 million.
> find YOUR_SRC_DIR -type d |wc -l
About 325000.
> Print the total number of links for each of those directories:
You say links for directories, but your command counts the links on
the files...
> find YOUR_SRC_DIR -type f -printf '%n\n'|awk '{s += $1} END {printf
> "%F\n", s}'
539 million, so roughly 115 links to each file on average.
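For the per-directory link counts you asked about, presumably the same
pipeline with -type d would do:

  find YOUR_SRC_DIR -type d -printf '%n\n' | awk '{s += $1} END {print s}'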
Rogier.
--
** address@hidden ** http://www.BitWizard.nl/ ** +31-15-2600998 **
** Delftechpark 26 2628 XH Delft, The Netherlands. KVK: 27239233 **
*-- BitWizard writes Linux device drivers for any device you may have! --*
Q: It doesn't work. A: Look buddy, doesn't work is an ambiguous statement.
Does it sit on the couch all day? Is it unemployed? Please be specific!
Define 'it' and what it isn't doing. --------- Adapted from lxrbot FAQ