bug-coreutils

Re: stat() order performance issues


From: Phillip Susi
Subject: Re: stat() order performance issues
Date: Fri, 26 Jan 2007 15:42:54 -0500
User-agent: Thunderbird 1.5.0.9 (Windows/20061207)

Jim Meyering wrote:
> That's good, but libc version matters too.
> And the kernel version.  Here, I have linux-2.6.18 and
> Debian/unstable's libc-2.3.6.

How does the kernel or libc version matter at all? What matters is the on-disk filesystem layout, which is not optimized for fetching stat information on files in what is essentially a random order rather than in inode order.

In the case of ext2/3, the inodes are stored on disk in numerical order; on reiserfs they tend to be stored in order, but don't have to be. On ext2/3 I believe file names are stored in the order they were created, and on reiserfs they are stored in order of their hash. In both cases the order of the names returned from readdir is essentially unrelated to the order of the inodes.
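To see how far directory order and inode order diverge in a given directory, something like the following rough sketch could be used. It assumes GNU ls (for -U, which lists entries unsorted, in directory order) and file names without embedded newlines; the function name inode_order_check is made up for illustration.

```sh
#!/bin/sh
# Hypothetical helper: count how many entries, read in directory
# (readdir) order, have a smaller inode number than the entry before them.
# Assumes GNU ls; -U = do not sort, -i = print inode, -1 = one per line.
inode_order_check() {
    dir=${1:-.}
    ls -Ui1 -- "$dir" | awk '
        # Force numeric comparison of the inode column.
        NR > 1 && ($1 + 0) < prev { inversions++ }
        { prev = $1 + 0 }
        END { printf "%d entries, %d out of inode order\n", NR, inversions + 0 }'
}
```

Running `inode_order_check /some/dir` on a large, long-lived directory should report many out-of-order entries if the names and inodes are as unrelated as described above; on a freshly created directory the count may well be small.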

Anyhow, I am running kernel 2.6.15 and libc 2.3.6.

> 10-15 minutes is very bad.
> Something needs an upgrade.

Or a bugfix/enhancement, unless there already is a newer version of coreutils that stats in inode order. My version of coreutils is 5.93.

> I presume you used xargs -- you wouldn't run stat 114K times...

Yes....

ls -Ui > files
cat files | sort -g | cut -c 9- > files-sorted
cat files | cut -c 9- > files-unsorted
time cat files-unsorted | xargs stat > /dev/null
< clear cache >
time cat files-sorted | xargs stat > /dev/null


Sorting by inode number made the stats at least 10 times faster.
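For anyone wanting to repeat the comparison, here is a sketch of the sorted case that does not depend on the inode column being exactly 8 characters wide, as the cut -c 9- above does. It assumes GNU ls, GNU stat (for -c) and an xargs that supports -0 and -r, and it still breaks on file names containing newlines. The function name stat_in_inode_order is only illustrative, and this is not something coreutils does by itself.

```sh
#!/bin/sh
# Hypothetical helper: stat every entry of a directory in ascending
# inode order and print "inode name" for each one.
stat_in_inode_order() {
    dir=${1:-.}
    (
        cd -- "$dir" || exit 1
        ls -Ui1 |                      # directory order, inode in column 1
        sort -n |                      # reorder by inode number
        sed 's/^ *[0-9][0-9]* //' |    # drop the inode column, keep the name
        tr '\n' '\0' |                 # NUL-separate so spaces in names survive
        xargs -0 -r stat -c '%i %n' -- # -r: do nothing for an empty directory
    )
}
```

To time it the same way as above, run `time stat_in_inode_order /some/dir > /dev/null` with a cold cache and compare against the unsorted run.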





