bug-findutils
Re: Why is `find -name '*.txt'` much slower than '*.txt' on glusterfs?


From: Peng Yu
Subject: Re: Why is `find -name '*.txt'` much slower than '*.txt' on glusterfs?
Date: Wed, 24 Jan 2018 14:17:54 -0600

Here are the results of `strace -c` and the runtime with `-maxdepth 1`
for `find`.

$ time find -maxdepth 1 -name '*.tsv' > /dev/null

real    0m21.118s
user    0m0.446s
sys    0m0.577s
$ time find -name '*.tsv' > /dev/null

real    0m21.277s
user    0m0.454s
sys    0m0.636s
$ time ./main.sh  > /dev/null

real    0m2.695s
user    0m0.046s
sys    0m0.057s
$ cat main.sh
#!/usr/bin/env bash
# vim: set noexpandtab tabstop=2:

echo *.tsv
$ strace -c find -maxdepth 1 -name '*.tsv' > /dev/null
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 93.87    0.433259          18     23439           newfstatat
  5.70    0.026288          25      1067           getdents
  0.16    0.000725          45        16        10 open
  0.06    0.000259          22        12           mmap
  0.04    0.000190          32         6           read
  0.03    0.000143           3        54           brk
  0.03    0.000132          22         6           mprotect
  0.03    0.000125          18         7           fstat
  0.02    0.000109         109         1         1 access
  0.02    0.000075           7        11           close
  0.01    0.000064          21         3         3 stat
  0.01    0.000036           4        10           fcntl
  0.01    0.000035          35         1           execve
  0.01    0.000033          33         1           munmap
  0.00    0.000019           6         3         2 ioctl
  0.00    0.000017           1        21           write
  0.00    0.000017          17         1           arch_prctl
  0.00    0.000016          16         1           uname
  0.00    0.000014           7         2           fstatfs
  0.00    0.000000           0         1           fchdir
  0.00    0.000000           0         1           sysinfo
  0.00    0.000000           0         1           openat
------ ----------- ----------- --------- --------- ----------------
100.00    0.461556                 24665        16 total

$ strace -c ./main.sh > /dev/null
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 77.26    0.009637           9      1067           getdents
  7.50    0.000936          49        19         8 open
  3.99    0.000498          26        19        17 execve
  2.38    0.000297           3        94           brk
  2.18    0.000272          13        21           mmap
  1.48    0.000185           4        46        23 stat
  1.15    0.000144          12        12           mprotect
  1.05    0.000131          15         9           read
  0.99    0.000123          15         8         4 access
  0.62    0.000077           8        10           fstat
  0.51    0.000064           6        11           close
  0.24    0.000030          30         1           munmap
  0.24    0.000030           2        14           rt_sigaction
  0.14    0.000017           9         2           arch_prctl
  0.07    0.000009           2         6           rt_sigprocmask
  0.06    0.000008           8         1           sysinfo
  0.02    0.000003           1         4         4 ioctl
  0.02    0.000002           0        70           write
  0.02    0.000002           2         1           uname
  0.02    0.000002           0         5           getuid
  0.02    0.000002           0         5           getgid
  0.02    0.000002           0         5           geteuid
  0.02    0.000002           0         5           getegid
  0.00    0.000000           0         3           lseek
  0.00    0.000000           0         1           dup2
  0.00    0.000000           0         1           getpid
  0.00    0.000000           0         3         1 fcntl
  0.00    0.000000           0         2           getrlimit
  0.00    0.000000           0         1           getppid
  0.00    0.000000           0         1           getpgrp
------ ----------- ----------- --------- --------- ----------------
100.00    0.012473                  1447        57 total
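
A rough sanity check on the numbers above, offered as a sketch rather than a diagnosis. Both runs issue the same 1067 getdents calls; the main difference is the 23439 newfstatat calls made by find. Note that `strace -c` summarizes system CPU time by default, which is why its totals (0.46 s and 0.012 s) are far below the wall-clock times; `strace -c -w` reports wall-clock time per syscall instead. Assuming (not verified here) that nearly all of find's wall-clock time is spent waiting on newfstatat, the implied cost per call can be estimated like this:

```shell
#!/usr/bin/env bash
# Back-of-the-envelope estimate using only figures reported in this message.
# ASSUMPTION: almost all of find's wall-clock time is spent waiting in
# newfstatat(); this is a guess, not something the strace -c output proves.

find_real=21.118   # seconds, from `time find -maxdepth 1 -name '*.tsv'`
find_stats=23439   # newfstatat calls, from `strace -c find ...`
glob_real=2.695    # seconds, from `time ./main.sh`

# Implied wall-clock cost of one newfstatat call, in milliseconds.
per_stat_ms=$(LC_ALL=C awk -v t="$find_real" -v n="$find_stats" \
  'BEGIN { printf "%.2f", t / n * 1000 }')

# How many times slower find is than the shell glob, by wall clock.
ratio=$(LC_ALL=C awk -v a="$find_real" -v b="$glob_real" \
  'BEGIN { printf "%.1f", a / b }')

echo "implied wall-clock cost per newfstatat: ${per_stat_ms} ms"
echo "find / glob wall-clock ratio: ${ratio}x"

# To measure instead of estimate, wall-clock time per syscall can be
# summarized with:  strace -c -w find -maxdepth 1 -name '*.tsv' > /dev/null
```

If the `-w` summary shows newfstatat dominating wall-clock time as well, that would support the assumption made in this estimate.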

On Wed, Jan 24, 2018 at 1:39 AM, Bernhard Voelker
<address@hidden> wrote:
> On 01/24/2018 01:44 AM, Peng Yu wrote:
>>
>> The attached files are the strace results for `echo` and `find`. Can
>> anybody check whether there is a way to improve the performance of `find`
>> so that it works as efficiently as `echo` in this test case? Thanks.
>>
>> $ cat main.sh
>> #!/usr/bin/env bash
>> # vim: set noexpandtab tabstop=2:
>>
>> echo *.txt
>> $ strace ./main.sh 2>/tmp/echo_strace.txt
>> $ strace find -name '*.txt' > /dev/null 2>/tmp/find_strace.txt
>
>
> First of all, please refrain from attaching such huge files when
> sending to mailing lists like this; either upload them to a web
> paste bin, or at least compress the files, e.g. the larger file
> could have wasted only <100k instead of 2.3M.  Thanks.
>
> Regarding the strace outputs: you followed neither James's tip
> (use "strace -c ...") nor Dale's (use "find -maxdepth 1 ..."),
> so just from the number of system calls one can already guess
> that the time is spent in the newfstatat() calls.
>
> We don't see what the previous getdents() calls return (strace -v),
> but it seems that it doesn't include D_TYPE information on glusterfs.
> Therefore, as you omitted the '-maxdepth 1' argument, find needs
> to dig deeper to check whether any of the entries is a directory
> (which it would then need to recurse into).
>
> BTW: you already got the same answer on your cross-posting [1].
>
> [1] https://lists.gnu.org/r/coreutils/2018-01/msg00058.html
>
> Have a nice day,
> Berny
>
>
>



-- 
Regards,
Peng


