[PATCH] rm, du, chmod, chown, chgrp: use much less memory for large directories

From: Jim Meyering
Subject: [PATCH] rm, du, chmod, chown, chgrp: use much less memory for large directories
Date: Fri, 19 Aug 2011 18:24:46 +0200

[Cc'ing bug-findutils, since it needs the same treatment]

Applied to a directory containing too many entries, tools like rm, du,
chmod, chown, chgrp and chcon would exhaust virtual memory.
At ~1GiB per 4 million entries, you can calculate how many
entries it would take to cause trouble in your environment.
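
As a rough illustration of that calculation, here is a small shell sketch. It assumes the ~1GiB per 4 million entries figure scales linearly, which is only an approximation; `entries_for_gib` is a hypothetical helper name, not something in coreutils.

```shell
#!/bin/sh
# Rough estimate only: assumes memory use grows linearly with entry count,
# based on the ~1GiB per 4 million entries figure quoted above.
bytes_per_entry=$(( (1024 * 1024 * 1024) / 4000000 ))

# entries_for_gib N: hypothetical helper that prints roughly how many
# directory entries it would take to consume N GiB at that rate.
entries_for_gib() {
  echo $(( $1 * 1024 * 1024 * 1024 / bytes_per_entry ))
}

echo "approx. bytes per entry: $bytes_per_entry"
echo "approx. entries to consume 8 GiB: $(entries_for_gib 8)"
```

That works out to roughly 268 bytes per entry, so a machine with 8 GiB available would run into trouble at around 32 million entries.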

This change updates coreutils to use the just-fixed version
of fts (the dir-traversal code) from gnulib.  With it,
the fts-internal memory utilization is capped at about 30MB.
It also adds a test that is marked as very expensive.
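
For anyone who wants to try the idea on a smaller scale, here is a scaled-down sketch of the approach: populate a directory, then remove it with virtual memory capped via `ulimit -v`. It is not the test from the patch. The entry count (20000) and the use of `mktemp -d` for a scratch directory are my own arbitrary choices, and at this size even an unfixed rm would be expected to succeed, so it only demonstrates the mechanics.

```shell
#!/bin/sh
# Scaled-down sketch of the technique; not the test added by the patch.
# The entry count and scratch directory are arbitrary illustrative choices.
tmp=$(mktemp -d) || exit 1
mkdir "$tmp/d" || exit 1

# Populate the directory with many empty files.
(cd "$tmp/d" && seq 20000 | xargs touch) || exit 1

# Apply the limit inside a subshell so it does not affect the rest of
# this script.  50000 (KiB) is the value used in the patch's test.
if (ulimit -v 50000 && rm -rf "$tmp/d"); then
  status=0
else
  status=1
fi

# Clean up the scratch directory outside the limited subshell.
rm -rf "$tmp"
echo "rm under limit: exit status $status"
```

Running the limited command in a subshell matters because a lowered `ulimit -v` cannot normally be raised again by an unprivileged process.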

To run just this test, you can do this:

  make check -C tests TESTS=rm/4-million-entry-dir VERBOSE=yes \

I'll wait at least a couple hours before pushing this.

From 0ba576979a10a11e5652fd155266464b1e784892 Mon Sep 17 00:00:00 2001
From: Jim Meyering <address@hidden>
Date: Fri, 19 Aug 2011 17:51:45 +0200
Subject: [PATCH] rm, du, chmod, chown, chgrp: use much less memory for large
 directories

For details, see the gnulib commit,
* tests/rm/4-million-entry-dir: New test.
* tests/ (TESTS): Add it.
* NEWS (Bug fixes): Mention it.
* gnulib: Update to latest to get the required fts fixes.
 NEWS                         |    9 +++++++++
 gnulib                       |    2 +-
 tests/            |    1 +
 tests/rm/4-million-entry-dir |   35 +++++++++++++++++++++++++++++++++++
 4 files changed, 46 insertions(+), 1 deletions(-)
 create mode 100755 tests/rm/4-million-entry-dir

diff --git a/NEWS b/NEWS
index 6e24f5c..b356a03 100644
--- a/NEWS
+++ b/NEWS
@@ -17,6 +17,15 @@ GNU coreutils NEWS                                    -*- outline -*-
   to dst/s/b rather than simply linking dst/s/b to dst/s/a.
   [This bug appears to have been present in "the beginning".]

+  fts-using tools (rm, du, chmod, chgrp, chown, chcon) no longer use memory
+  proportional to the number of entries in each directory they process.
+  Before, rm -rf 4-million-entry-directory would consume about 1GiB of memory.
+  Now, it uses less than 30MB, no matter how many entries there are.
+  [this bug was inherent in the use of fts: thus, for rm the bug was
+  introduced in coreutils-8.0.  The prior implementation of rm did not use
+  as much memory.  du, chmod, chgrp and chown started using fts in 6.0.
+  chcon was added in coreutils-6.9.91 with fts support.  ]
   printf '%d' '"' no longer accesses out-of-bounds memory in the diagnostic.
   [bug introduced in sh-utils-1.16]

diff --git a/gnulib b/gnulib
index d2b8ab6..47cb657 160000
--- a/gnulib
+++ b/gnulib
@@ -1 +1 @@
-Subproject commit d2b8ab669f3129ac0d349eead1217adc38d795eb
+Subproject commit 47cb657eca1abf2c26c32c8ce03def994a3ee37c
diff --git a/tests/ b/tests/
index eb67512..f0200e1 100644
--- a/tests/
+++ b/tests/
@@ -135,6 +135,7 @@ TESTS =                                             \
   rm/unread3                                   \
   rm/unreadable                                        \
   rm/v-slash                                   \
+  rm/4-million-entry-dir                       \
   chgrp/default-no-deref                       \
   chgrp/deref                                  \
   chgrp/no-x                                   \
diff --git a/tests/rm/4-million-entry-dir b/tests/rm/4-million-entry-dir
new file mode 100755
index 0000000..23130a6
--- /dev/null
+++ b/tests/rm/4-million-entry-dir
@@ -0,0 +1,35 @@
+# in coreutils-8.12, this would have required ~1GB of memory
+# Copyright (C) 2011 Free Software Foundation, Inc.
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 3 of the License, or
+# (at your option) any later version.
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <>.
+. "${srcdir=.}/"; path_prepend_ ../src
+print_ver_ rm
+# Put 4M files in a directory.
+mkdir d && cd d || framework_failure_
+seq 4000000|xargs touch || framework_failure_
+cd ..
+# Restricted to 50MB, rm from coreutils-8.12 would fail with a
+# diagnostic like "rm: fts_read failed: Cannot allocate memory".
+ulimit -v 50000
+rm -rf d || fail=1
+Exit $fail

