coreutils
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: "ls -l": Avoid unnecessary getxattr() overhead


From: Jim Meyering
Subject: Re: "ls -l": Avoid unnecessary getxattr() overhead
Date: Thu, 16 Feb 2012 20:39:33 +0100

Sven Breuner wrote:
...
> (Tested with similar results on cifs, ext4, fhgfs.)

Thanks for the testing.

Here are two patches to eliminate all unnecessary getxattr calls
(you've already seen the first) and one more to add a test script.
Sven, it should be interesting to see your speed-up numbers ;-)

The test is interesting in that it uses LD_PRELOAD, and is expected
to be skipped (or fail) on systems that don't support that.
Which makes me realize that I don't yet have code to make the test
skip when $CC and ld succeed but where LD_PRELOAD is not supported.

At first I wanted to write a root-only test that would
create a file system of some type for which *getxattr would always
fail, but with linux-kernel-based systems, that may be a challenge,
since all non-network FS types seem to have some amount of support, now.
I am avoiding tests involving network file systems: that would be
a little too invasive for my taste.

But then Daniel Berrangé suggested to use LD_PRELOAD.  That's better
because the test does not require root privileges (most mount-related
tests would), but comes with a few challenges.  I'll skip the
portability ones for now.  The part that I found surprising is that
from the .so wrapper calls, while I could indeed count the number of
intercepted *getxattr calls, I could *not* access that static count
from an __attribute__((destructor)) function that would simply print
that static variable.  The printed number was always zero, because, as I
found, when the destructor is invoked, the static variable has a different
address than the one seen by the intercepted calls.  My current method
is to write a single line to file descriptor 33 from each intercepted call,
redirect that FD, and to count the lines of the resulting file.
Definitely worthy of the "FIXME" comment in that script.



>From ac6dc056f8a2ed00b86c63af50e949e1ac30dae0 Mon Sep 17 00:00:00 2001
From: Jim Meyering <address@hidden>
Date: Sat, 11 Feb 2012 12:36:57 +0100
Subject: [PATCH 1/3] ls: optimize for when getfilecon would often fail (~30%
 speed-up)

On systems or file systems without SELinux support, all getfilecon
and lgetfilecon calls would fail due to lack of support.  We can non-
invasively cache such failure (on most recently accessed device) and
avoid the vast majority of the failing underlying getxattr syscalls.
* src/ls.c (errno_unsupported): New function.
(selinux_challenged_device): New file-scoped global.
(getfilecon_cache, lgetfilecon_cache): New error-caching wrapper
functions.
(gobble_file): Use the caching wrappers, for when many *getfilecon
calls would fail with ENOTSUP or EOPNOTSUPP.
Suggested by Sven Breuner in
http://thread.gmane.org/gmane.comp.gnu.coreutils.general/2187
---
 src/ls.c |   52 +++++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 49 insertions(+), 3 deletions(-)

diff --git a/src/ls.c b/src/ls.c
index f5cd37a..9e8806f 100644
--- a/src/ls.c
+++ b/src/ls.c
@@ -2777,10 +2777,56 @@ clear_files (void)
   file_size_width = 0;
 }

+/* Return true if ERR implies lack-of-support failure by a
+   getxattr-calling function like getfilecon.  */
+static bool
+errno_unsupported (int err)
+{
+  return err == ENOTSUP || err == EOPNOTSUPP;
+}
+
+/* st_dev of the most recently processed device for which
+   we've found that getfilecon or lgetfilecon fails with
+   e.g., ENOTSUP or EOPNOTSUPP.  */
+static dev_t selinux_challenged_device;
+
+/* Cache getfilecon failure, when it's trivial to do.
+   Like getfilecon, but when F's st_dev says it's on a known-
+   SELinux-challenged file system, fail with ENOTSUP immediately.  */
+static int
+getfilecon_cache (char const *file, struct fileinfo *f)
+{
+  if (f->stat.st_dev == selinux_challenged_device)
+    {
+      errno = ENOTSUP;
+      return -1;
+    }
+  int r = getfilecon (file, &f->scontext);
+  if (r < 0 && errno_unsupported (errno))
+    selinux_challenged_device = f->stat.st_dev;
+  return r;
+}
+
+/* Cache lgetfilecon failure, when it's trivial to do.
+   Like lgetfilecon, but when F's st_dev says it's on a known-
+   SELinux-challenged file system, fail with ENOTSUP immediately.  */
+static int
+lgetfilecon_cache (char const *file, struct fileinfo *f)
+{
+  if (f->stat.st_dev == selinux_challenged_device)
+    {
+      errno = ENOTSUP;
+      return -1;
+    }
+  int r = lgetfilecon (file, &f->scontext);
+  if (r < 0 && errno_unsupported (errno))
+    selinux_challenged_device = f->stat.st_dev;
+  return r;
+}
+
 /* Add a file to the current table of files.
    Verify that the file exists, and print an error message if it does not.
    Return the number of blocks that the file occupies.  */
-
 static uintmax_t
 gobble_file (char const *name, enum filetype type, ino_t inode,
              bool command_line_arg, char const *dirname)
@@ -2919,8 +2965,8 @@ gobble_file (char const *name, enum filetype type, ino_t 
inode,
           bool have_selinux = false;
           bool have_acl = false;
           int attr_len = (do_deref
-                          ?  getfilecon (absolute_name, &f->scontext)
-                          : lgetfilecon (absolute_name, &f->scontext));
+                          ? getfilecon_cache (absolute_name, f)
+                          : lgetfilecon_cache (absolute_name, f));
           err = (attr_len < 0);

           if (err == 0)
--
1.7.9.1.232.g1a183


>From 796b05173ad9d34a3403e2201152a1a81fd973a1 Mon Sep 17 00:00:00 2001
From: Jim Meyering <address@hidden>
Date: Mon, 13 Feb 2012 12:05:40 +0100
Subject: [PATCH 2/3] ls: also cache ACL- and CAP-querying syscall failures

Like the optimization to avoid always-failing getfilecon calls,
this change avoids always-failing queries for whether a file has
a nontrivial ACL and for whether a file has certain "capabilities".
Those queries often resolve to getxattr syscalls, and when they
fail for one query (indicating no support), they are known always
to fail that way for the affected device.  With this change-set, we
have thus eliminated nearly all failing-unsupported getxattr syscalls.
* src/ls.c (has_capability) [!HAVE_CAP]: Set errno to ENOTSUP.
(errno_unsupported): Expand the list of E* errno values to match
that of lib/acl-internal.h's ACL_NOT_WELL_SUPPORTED macro.
(file_has_acl_cache, has_capability_cache): New functions.
(gobble_file): Use them in place of non-caching ones.
Suggested by Sven Breuner in
http://thread.gmane.org/gmane.comp.gnu.coreutils.general/2187
---
 src/ls.c |   61 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 57 insertions(+), 4 deletions(-)

diff --git a/src/ls.c b/src/ls.c
index 9e8806f..b7c7af6 100644
--- a/src/ls.c
+++ b/src/ls.c
@@ -2736,6 +2736,7 @@ has_capability (char const *name)
 static bool
 has_capability (char const *name ATTRIBUTE_UNUSED)
 {
+  errno = ENOTSUP;
   return false;
 }
 #endif
@@ -2778,11 +2779,16 @@ clear_files (void)
 }

 /* Return true if ERR implies lack-of-support failure by a
-   getxattr-calling function like getfilecon.  */
+   getxattr-calling function like getfilecon or file_has_acl.  */
 static bool
 errno_unsupported (int err)
 {
-  return err == ENOTSUP || err == EOPNOTSUPP;
+  return (err == EBUSY
+          || err == EINVAL
+          || err == ENOENT
+          || err == ENOSYS
+          || err == ENOTSUP
+          || err == EOPNOTSUPP);
 }

 /* st_dev of the most recently processed device for which
@@ -2824,6 +2830,53 @@ lgetfilecon_cache (char const *file, struct fileinfo *f)
   return r;
 }

+/* Cache file_has_acl failure, when it's trivial to do.
+   Like file_has_acl, but when F's st_dev says it's on a file
+   system lacking ACL support, return 0 with ENOTSUP immediately.  */
+static int
+file_has_acl_cache (char const *file, struct fileinfo *f)
+{
+  /* st_dev of the most recently processed device for which we've
+     found that file_has_acl fails indicating lack of support.  */
+  static dev_t unsupported_device;
+
+  if (f->stat.st_dev == unsupported_device)
+    {
+      errno = ENOTSUP;
+      return 0;
+    }
+
+  /* Zero errno so that we can distinguish between two 0-returning cases:
+     "has-ACL-support, but only a default ACL" and "no ACL support". */
+  errno = 0;
+  int n = file_has_acl (file, &f->stat);
+  if (n <= 0 && errno_unsupported (errno))
+    unsupported_device = f->stat.st_dev;
+  return n;
+}
+
+/* Cache has_capability failure, when it's trivial to do.
+   Like has_capability, but when F's st_dev says it's on a file
+   system lacking capability support, return 0 with ENOTSUP immediately.  */
+static bool
+has_capability_cache (char const *file, struct fileinfo *f)
+{
+  /* st_dev of the most recently processed device for which we've
+     found that has_capability fails indicating lack of support.  */
+  static dev_t unsupported_device;
+
+  if (f->stat.st_dev == unsupported_device)
+    {
+      errno = ENOTSUP;
+      return 0;
+    }
+
+  bool b = has_capability (file);
+  if ( !b && errno_unsupported (errno))
+    unsupported_device = f->stat.st_dev;
+  return b;
+}
+
 /* Add a file to the current table of files.
    Verify that the file exists, and print an error message if it does not.
    Return the number of blocks that the file occupies.  */
@@ -2958,7 +3011,7 @@ gobble_file (char const *name, enum filetype type, ino_t 
inode,
       /* Note has_capability() adds around 30% runtime to 'ls --color'  */
       if ((type == normal || S_ISREG (f->stat.st_mode))
           && print_with_color && is_colored (C_CAP))
-        f->has_capability = has_capability (absolute_name);
+        f->has_capability = has_capability_cache (absolute_name, f);

       if (format == long_format || print_scontext)
         {
@@ -2985,7 +3038,7 @@ gobble_file (char const *name, enum filetype type, ino_t 
inode,

           if (err == 0 && format == long_format)
             {
-              int n = file_has_acl (absolute_name, &f->stat);
+              int n = file_has_acl_cache (absolute_name, f);
               err = (n < 0);
               have_acl = (0 < n);
             }
--
1.7.9.1.232.g1a183


>From 92556f46359798797963b9606e60994b9d31b1b9 Mon Sep 17 00:00:00 2001
From: Jim Meyering <address@hidden>
Date: Mon, 13 Feb 2012 12:52:19 +0100
Subject: [PATCH 3/3] tests: test for getxattr speed-up

* tests/ls/getxattr-speedup: New test.
* tests/Makefile.am (TESTS): Add it.
---
 tests/Makefile.am         |    1 +
 tests/ls/getxattr-speedup |   60 +++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 61 insertions(+)
 create mode 100755 tests/ls/getxattr-speedup

diff --git a/tests/Makefile.am b/tests/Makefile.am
index 7b53681..625b636 100644
--- a/tests/Makefile.am
+++ b/tests/Makefile.am
@@ -426,6 +426,7 @@ TESTS =                                             \
   ls/dired                                     \
   ls/file-type                                 \
   ls/follow-slink                              \
+  ls/getxattr-speedup                          \
   ls/infloop                                   \
   ls/inode                                     \
   ls/m-option                                  \
diff --git a/tests/ls/getxattr-speedup b/tests/ls/getxattr-speedup
new file mode 100755
index 0000000..f3dc5bb
--- /dev/null
+++ b/tests/ls/getxattr-speedup
@@ -0,0 +1,60 @@
+#!/bin/sh
+# Show that we've eliminated most of ls' failing getxattr syscalls,
+# regardless of how many files are in a directory we list.
+# This test is skipped on systems that lack LD_PRELOAD support; that's fine.
+
+# Copyright (C) 2012 Free Software Foundation, Inc.
+
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 3 of the License, or
+# (at your option) any later version.
+
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+. "${srcdir=.}/init.sh"; path_prepend_ ../src
+print_ver_ ls
+
+fd=33
+
+# Replace each getxattr and lgetxattr call with a call to these stubs.
+# Each writes a single line to the specified FD so that we can later
+# determine the total number of calls.  FIXME: there must be a better way.
+cat > k.c <<'EOF' || framework_failure_
+#include <errno.h>
+#include <unistd.h>
+static void
+write_1 (void)
+{
+  int fd = @FD@;
+  write (fd, "x\n", 2);
+}
+ssize_t getxattr (const char *path, const char *name, void *value, size_t size)
+{ write_1 (); errno = ENOTSUP; return -1; }
+ssize_t lgetxattr(const char *path, const char *name, void *value, size_t size)
+{ write_1 (); errno = ENOTSUP; return -1; }
+EOF
+sed "s/@FD@/$fd/" k.c > k || framework_failure_
+mv k k.c || framework_failure_
+
+# Then compile/link it:
+$CC -fPIC -O2 -c k.c || framework_failure_ 'failed to compile with -fPIC'
+ld -G k.o -o k.so || framework_failure_ 'failed to invoke ld -G ...'
+
+# create a few files
+seq 100 | xargs touch || framework_failure_
+
+# Finally, run the test, redirecting
+eval "LD_PRELOAD=$PWD/k.so ls --color=always -l . $fd>x" || fail=1
+
+# Ensure that there were no more than 3 *getxattr calls.
+n_calls=$(wc -l < x)
+test "$n_calls" -le 3 || fail=1
+
+Exit $fail
--
1.7.9.1.232.g1a183



reply via email to

[Prev in Thread] Current Thread [Next in Thread]