Re: Why is `find -name '*.txt'` much slower than '*.txt' on glusterfs?

From: Bernhard Voelker
Subject: Re: Why is `find -name '*.txt'` much slower than '*.txt' on glusterfs?
Date: Sun, 28 Jan 2018 19:57:28 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.5.2

On 01/27/2018 06:45 PM, Peng Yu wrote:
glusterfs doesn't provide D_TYPE information:

getdents(4, {{d_ino=10054722685526780333, ..., d_type=DT_UNKNOWN} ...

Nevertheless, it is strange that find calls newfstatat() also
in the case of "-maxdepth 1" - it shouldn't need to.
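To illustrate why a stat call should not be needed here: matching entries against a name pattern at depth 1 only requires the names that getdents() already returns. The following is a minimal Python sketch of that idea, not how find or gnulib's fts is implemented. The function name names_matching is hypothetical, and the sketch assumes Linux, where os.scandir() reads directory entries without calling stat() as long as no type or attribute methods are invoked on the entries.

```python
import fnmatch
import os


def names_matching(directory, pattern):
    """Return the sorted entry names in `directory` that match `pattern`.

    Only the names from the directory listing are inspected, so the
    file type (d_type) of each entry is never needed. This mirrors what
    `find DIR -maxdepth 1 -name PATTERN` could do in principle.
    """
    with os.scandir(directory) as entries:
        # fnmatchcase() compares case-sensitively, like find's -name.
        return sorted(
            entry.name
            for entry in entries
            if fnmatch.fnmatchcase(entry.name, pattern)
        )
```

Running such a script under strace on a mount that returns DT_UNKNOWN should show getdents calls only, assuming the Python runtime itself issues no per-entry stat calls.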

Should this be considered as a performance bug of 'find'?

Well, maybe.

I could reproduce this case with sshfs where getdents also returns DT_UNKNOWN.

  $ mkdir -p ~/tmp/d1 \
      && seq 10000 | xargs env -C ~/tmp/d1 touch

  $ mkdir -p ~/tmp/mnt \
      && sshfs localhost:tmp/d1 ~/tmp/mnt

  $ strace -ve getdents,newfstatat find ~/tmp/mnt -maxdepth 1

  $ strace -ve getdents,newfstatat find -D search ~/tmp/mnt -maxdepth 1 -name 

The problem seems to be that gnulib's fts_read() already tries to determine
whether the current item is a directory [1]:

  getdents(4, [], 32768)                  = 0
  newfstatat(5, "8793", {st_dev=makedev(0, 46), st_ino=2, st_mode=S_IFREG|0644, 

before find sees it [2]:

  consider_visiting (early): ‘/home/berny/tmp/mnt/8793’: fts_info=FTS_F , [...]

@James: do you have an idea how to work around this?


Have a nice day,
