[bug #57693] find wastefully calls stat for leaves

From: Vladimir Panteleev
Subject: [bug #57693] find wastefully calls stat for leaves
Date: Wed, 29 Jan 2020 07:49:30 -0500 (EST)
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0


                 Summary: find wastefully calls stat for leaves
                 Project: findutils
            Submitted by: vpanteleev
            Submitted on: Wed 29 Jan 2020 12:49:28 PM UTC
                Category: find
                Severity: 3 - Normal
              Item Group: None
                  Status: None
                 Privacy: Public
             Assigned to: None
         Originator Name: 
        Originator Email: 
             Open/Closed: Open
         Discussion Lock: Any
                 Release: None
           Fixed Release: None



Consider the invocation:

find /dir -mindepth 1 -maxdepth 1

The expected behavior is for find to print the full paths of all directory
entries in /dir, which it does.

However, as far as I can see, this task should not require find to perform
stat calls on the directory entries of /dir. Nevertheless, it does so.

In certain situations, a mere directory listing is much faster than also
calling stat on every member. In my case, I am seeing a considerable
performance difference when enumerating snapshots (~2000 total) on a btrfs
filesystem located on a HDD. `ls /dir | cat` is almost instantaneous (here,
the output is piped through `cat` so that `ls` doesn't attempt to colorize
entries, which would require `stat` calls). However, the aforementioned `find`
invocation, as well as just `ls`, takes several minutes.
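
To illustrate the difference, here is a minimal Python sketch. It is not how find is implemented, and both function names are hypothetical. The first function produces the output I would expect from the invocation above using only the directory listing; the second adds one lstat call per entry, which is roughly the extra work I believe is being done.

```python
import os


def list_entries(directory):
    """Return the full paths of all entries in `directory`.

    Only the directory listing is read; no stat/lstat call is made
    on the individual entries.
    """
    with os.scandir(directory) as entries:
        return [entry.path for entry in entries]


def list_entries_with_lstat(directory):
    """Return the same paths, but also call lstat() on every entry."""
    paths = []
    with os.scandir(directory) as entries:
        for entry in entries:
            os.lstat(entry.path)  # one extra system call per entry
            paths.append(entry.path)
    return paths
```

Both functions return the same list of paths; timing them on a large directory on a slow disk should show whether the per-entry lstat calls account for the difference.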

The find manual states the following for the -D option:

    stat   Print messages as files are examined with the stat and lstat system
           calls.  The find program tries to minimise such calls.

However, I can observe that find does call stat, and that it prints nothing
about these calls even with `-D stat` prepended to its command line.

I confirmed that find calls stat by attaching gdb to the running process and
examining its backtrace. With findutils
28f11d689dc61f9202de44078d67299419fbad26 and gnulib
a7903da07d3d18c23314aa0815adbb4058fd7cec, here is one instance:

Thread 1 (process 277972):
#0  0x00007f2aa82bdddf in __fxstatat64 () from /usr/lib/libc.so.6
#1  0x00005557b564ff96 in fstatat (__flag=256, __statbuf=0x5557b6204a48,
    __filename=<optimized out>, __fd=<optimized out>) at
#2  fts_stat (sp=sp@entry=0x5557b61cbf40, p=p@entry=0x5557b62049d0,
    follow=follow@entry=false) at fts.c:1827
#3  0x00005557b565208b in rpl_fts_read (sp=0x5557b61cbf40) at fts.c:1044
#4  0x00005557b5634012 in find (arg=0x7ffc358806ea
    "/mnt/2016-hdd-8t-raid/home") at ftsfind.c:561
#5  0x00005557b5633aea in process_all_startpoints (argv=<optimized out>,
    argc=<optimized out>) at ftsfind.c:625
#6  main (argc=<optimized out>, argv=<optimized out>) at ftsfind.c:734

It looks like stat is not being called by find directly, but rather by the fts
code in gnulib, so there is possibly a second bug here: `-D stat` does not
report stat calls made inside gnulib.

