bug-coreutils

Re: stat() order performance issues


From: Phillip Susi
Subject: Re: stat() order performance issues
Date: Fri, 26 Jan 2007 11:53:56 -0500
User-agent: Thunderbird 1.5.0.9 (Windows/20061207)

Jim Meyering wrote:
Which ls option(s) are you using?

I used ls -Ui to list inode numbers without sorting. I expected this to simply return the contents from getdents, but I see a stat64 call on each file, I believe in the order getdents returns them, which causes a massive seek storm.

Which file system?  As you probably know, it really matters.

In my case, reiserfs, but this should apply equally well to ext2/3.

If it's just "ls -U", then ls may not have to perform a single "stat" call.
If it's "ls -l", then the stat per file is inevitable.
But if it's "ls --inode" or "ls --file-type", with the right file system,
ls gets all it needs via readdir, and can skip all stat calls.  But with
some other file system types, it still has to stat every file.


It seems that ls -U does not stat, but ls -Ui does. It seems it shouldn't have to, because readdir returns both the name and the inode number, doesn't it?
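
To illustrate the point about readdir, here is a minimal Python sketch (not coreutils code). It relies on os.scandir, which on POSIX systems is built on readdir, and on DirEntry.inode(), which on Unix reads the inode number from the directory entry without a stat call. The function name list_inodes is hypothetical. Unlike ls -U, it also lists dotfiles.

```python
import os

def list_inodes(path="."):
    """Return (inode, name) pairs in raw directory order, roughly like `ls -AUi`."""
    with os.scandir(path) as entries:
        # entry.inode() comes from the directory entry itself on Unix,
        # so no per-file stat call is needed here.
        return [(entry.inode(), entry.name) for entry in entries]

if __name__ == "__main__":
    for inode, name in list_inodes("."):
        print(inode, name)
```

Running this under strace should show the getdents calls but no per-file stat calls, assuming a file system that fills in the inode field of its directory entries.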

For example, when I run "ls --file-type" on three maildirs containing
over 160K entries, it's nearly instantaneous.  There are only 3 stat calls:

    $ strace -c ls -1 a b c|wc -l

Are a, b and c files or directories? If they are files, then of course it would only stat 3 times, because you have only asked ls to look up 3 files. Try just ls -Ui without the a b c parameters.

du in a Maildir with many thousands of small files takes ages to
complete.  I have investigated and believe this is due to the order in

Yep.  du has to perform the stat calls.

"ages"?  Give us numbers.  Is NFS involved?  A slow disk?
I've just run "du -s" on a directory containing almost 70,000 entries,
and it didn't take *too* long with a cold cache: 21 seconds.

Modest disk, no NFS, 114k entries, and it takes 10-15 minutes with a cold cache. When I sorted the directory listing by inode number and ran stat on each entry in that order, again with cold caches, it took only about 1 minute.
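
The experiment described above can be sketched in Python as follows. This is an illustration only, not the shell instrumentation actually used, and the names stat_in_inode_order and disk_usage_bytes are hypothetical. It assumes that ascending inode order roughly matches on-disk order, which depends on the file system, and that st_blocks counts 512-byte units as it does on Linux.

```python
import os

def stat_in_inode_order(path="."):
    """lstat every entry of `path`, visiting entries in ascending inode order."""
    with os.scandir(path) as entries:
        # Collect inode numbers from the directory entries (no stat needed on Unix).
        pairs = [(entry.inode(), entry.name) for entry in entries]
    pairs.sort()  # ascending inode number, instead of raw directory order
    results = []
    for _inode, name in pairs:
        results.append((name, os.lstat(os.path.join(path, name))))
    return results

def disk_usage_bytes(path="."):
    """Sum allocated bytes of the entries directly inside `path` (non-recursive)."""
    # Assumption: st_blocks is in 512-byte units, as on Linux.
    return sum(st.st_blocks * 512 for _name, st in stat_in_inode_order(path))

if __name__ == "__main__":
    print(disk_usage_bytes("."))
```

To compare against unsorted order, drop the pairs.sort() line and time both variants with cold caches.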

Post your patch, so others can try easily.
If sorting entries (when possible, i.e., for du, and some invocations
of ls) before stating them really does result in a 10x speed-up on
"important" systems, then there's a good chance we'll do it.

I have no patch; I merely did some instrumentation with shell scripts, ls, and stat.





