
Re: Why is `find -name '*.txt'` much slower than '*.txt' on glusterfs?

From: James Youngman
Subject: Re: Why is `find -name '*.txt'` much slower than '*.txt' on glusterfs?
Date: Sun, 21 Jan 2018 13:49:47 +0000

On Sat, Jan 20, 2018 at 10:16 AM, Peng Yu <address@hidden> wrote:

> Hi,
> There are ~7000 .txt files in a directory on glusterfs. Here are the run
> time of the following two commands. Does anybody know why the find command
> is much slower than *.txt? Is there a way to change the API that `find`
> uses to search files so that it can be more friendly to
> glusterfs?
> $ time echo *.txt > /dev/null
> real    0m2.206s
> user    0m0.039s
> sys     0m0.056s
> $ time find -name '*.txt' > /dev/null
> real    0m18.558s
> user    0m0.317s
> sys     0m0.663s

Is this an apples-to-apples comparison? For example, does . contain
subdirectories? A comparison of the output of strace -c for both commands
will probably be illuminating. Perhaps stat calls are relatively
expensive on glusterfs. This happens on at least some other cluster
filesystems, because obtaining a correct value for st_size requires finding
the consensus answer for the current length of the file, while obtaining
the list of items in a directory may not require the same amount of locking
or consensus work.
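
One way to check both points is sketched below. It is only a rough sketch, not a definitive diagnosis: the function names (count_glob, count_find_top, count_find_all, count_subdirs, syscall_summary) are made up for illustration, -maxdepth and -mindepth assume GNU find, and syscall_summary assumes strace is installed. If count_find_all is larger than count_glob, or count_subdirs is non-zero, the two commands in the original timing are not doing the same work.

```shell
# Hypothetical helpers; names are illustrative, not part of findutils.

# Number of top-level names matched by the shell glob (what `echo *.txt` sees).
count_glob() {
    ( cd "$1" && set -- *.txt && [ -e "$1" ] && echo "$#" || echo 0 )
}

# Number of matches when find is restricted to the top level (GNU find).
count_find_top() {
    echo $(( $(find "$1" -maxdepth 1 -name '*.txt' | wc -l) ))
}

# Number of matches when find recurses, as in the original command.
count_find_all() {
    echo $(( $(find "$1" -name '*.txt' | wc -l) ))
}

# Number of subdirectories below the starting point.
count_subdirs() {
    echo $(( $(find "$1" -mindepth 1 -type d | wc -l) ))
}

# Per-syscall summary for a command run inside a directory (requires strace).
# strace writes the -c summary to stderr, so the command's stdout is discarded.
syscall_summary() {
    dir=$1; shift
    ( cd "$dir" && strace -c -f "$@" 2>&1 >/dev/null )
}

# Example usage (not run here):
#   syscall_summary . sh -c 'echo *.txt'
#   syscall_summary . find -name '*.txt'
```

In the two strace summaries, the rows worth comparing are the directory-reading calls (such as getdents64) against the stat-family calls; a large count or time share for the latter in the find run would be consistent with the guess that stat is the expensive operation.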

