bug-findutils
[bugs #3998] updatedb: exists early with ". changed during execution"


From: Steve Revilak
Subject: [bugs #3998] updatedb: exists early with ". changed during execution"
Date: Wed, 24 Nov 2004 14:12:57 -0500
User-agent: Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en) AppleWebKit/125.5.5 (KHTML, like Gecko) Safari/125.11

This mail is an automated notification from the bugs tracker
 of the project: findutils.

/**************************************************************************/
[bugs #3998] Latest Modifications:

Changes by: 
                Steve Revilak <address@hidden>
Date: 
                Wed 11/24/2004 at 19:06 (GMT)

------------------ Additional Follow-up Comments ----------------------------
Here's what I've observed with findutils-4.2.8 on Solaris 8.

$ find /AUTOMOUNT-POINT
# (where /AUTOMOUNT-POINT is a subdirectory of
#  /ROOT-DIRECTORY-OF-AUTOMOUNTER-MAP)
  * if AUTOMOUNT-POINT was mounted, find descends into directory
  * if AUTOMOUNT-POINT was not mounted, find triggers mount, prints
    warning ("Warning: filesystem ... has recently been mounted."),
    descends into directory

$ find /ROOT-DIRECTORY-OF-AUTOMOUNTER-MAP
  * descends into subdirs for mount points that are already mounted
  * lists directories of mount point subdirs that are not mounted

$ find /ROOT-DIRECTORY-OF-AUTOMOUNTER-MAP -follow
  * descends into all mount point subdirs, triggering mounts as needed.
    Does not emit warnings.








/**************************************************************************/
[bugs #3998] Full Item Snapshot:

URL: <http://savannah.gnu.org/bugs/?func=detailitem&item_id=3998>
Project: findutils
Submitted by: Steve Revilak
On: Mon 06/16/2003 at 19:23

Category:  find
Severity:  5 - Average
Item Group:  None
Resolution:  Fixed
Privacy:  Public
Assigned to:  None
Originator Name:  
Originator Email:  
Status:  Open
Release:  4.2.0
Fixed Release:  4.2.8


Summary:  updatedb: exists early with ". changed during execution"

Original Submission:  I'd like to report a problem with findutils-4.1.20 on 
Solaris 7.

I think that a comparison is the easiest way to illustrate.  For this,
I'm using an updatedb script and find binary that have been slightly
modified to provide additional debugging information  (Diffs are
appended, so that you can see exactly what's changed).

First - just a simple execution of find, run from the root of the
filesystem:

  $ time sudo /usr/local/src/findutils-4.1.20/find/find / | wc
  ^C
  real    419m54.387s
  user    3m52.060s
  sys     34m3.240s

If you factor in NFS shares, / is a big filesystem on this machine.  I
expected the above to take a while, but got impatient.  Let's try
again, sticking to a single partition:

  $ time sudo /usr/local/src/findutils-4.1.20/find/find /  -xdev | wc
   105906  105906 5109425

  real    0m18.742s
  user    0m1.430s
  sys     0m2.890s

Above, find found 105,906 files in about 18 seconds.  Remember these
numbers.


Next - try to run updatedb without any arguments.  This should find a
similar number of files (modulo the default set of prunepaths).

  $ time sudo /u/srevilak/updatedb
  /usr/local/src/findutils-4.1.20/find/find: find.c:528: . changed during execution of /usr/local/src/findutils-4.1.20/find/find
  /u/srevilak/updatedb: debug: entries in /tmp/debug-updatedb: 5

  real    0m1.572s
  user    0m0.060s
  sys     0m0.080s

updatedb comes up with only 5 files, and finishes in under 2 seconds.


To summarize, something in the updatedb invocation of find is causing
it to bail early.  Looking around find.c:528, we have

      if (stat_buf.st_dev != dir_ids[dir_curr].dev ||
          stat_buf.st_ino != dir_ids[dir_curr].ino)
        error (1, 0, _("%s changed during execution of %s"), starting_dir, program_name);


The test seems entirely reasonable, and under some invocations it
never returns positive.  Perhaps one of the predicates in the updatedb
invocation isn't taking care of dir_ids properly?
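
For readers unfamiliar with this kind of check, here is a minimal C
sketch of the idea.  It is not the code from find.c: same_file() and
changed_after_chdir() are hypothetical helpers written only for
illustration.  The sketch records a directory's device and inode
numbers, changes into the directory, stats ".", and reports whether
the pair still matches.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not from find.c): two stat results name the
   same file when both the device and the inode numbers agree. */
static bool same_file(const struct stat *a, const struct stat *b)
{
    return a->st_dev == b->st_dev && a->st_ino == b->st_ino;
}

/* Hypothetical helper: stat DIR by name, chdir into it, stat ".",
   then return to the original working directory.
   Returns 0 if the identity is unchanged, 1 if it changed,
   and -1 if any system call failed. */
static int changed_after_chdir(const char *dir)
{
    struct stat before, after;
    int result = -1;
    int saved_cwd;

    assert(dir != NULL);

    if (stat(dir, &before) != 0)
        return -1;

    /* Keep a descriptor for the current directory so we can go back. */
    saved_cwd = open(".", O_RDONLY);
    if (saved_cwd < 0)
        return -1;

    if (chdir(dir) == 0) {
        if (stat(".", &after) == 0)
            result = same_file(&before, &after) ? 0 : 1;
        if (fchdir(saved_cwd) != 0)
            result = -1;
    }
    close(saved_cwd);
    return result;
}
```

A result of 1 corresponds to the condition that makes find print
"changed during execution".  Later comments in this item suggest that
a directory mounted by the automounter during the chdir is one way to
get that result.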

I did notice `[ Bug #3786 ] updatedb fails if it cannot access "." as
the user to whom it switches uid.', but this appears to be something
different.  "updatedb --localuser=root --netuser=root" fails in the
same way.

Anyway, if you guys could look into this, I'd really appreciate it.
If you need any additional information, please let me know and I'll be
happy to provide it.  If I happen to get any further insight, I'll be
sure to pass it along.


Thanks much!

address@hidden


==================================================================
Aforementioned diffs follow:

==================================================================
updatedb diff.  
Change: tee output to a temporary file and say how many files appeared

--- /usr/local/gnu/bin/updatedb 2003-06-06 08:13:25.000000000 -0400
+++ /u/srevilak/updatedb        2003-06-16 07:34:48.938002000 -0400
@@ -100,6 +100,8 @@
 : ${bigram=${LIBEXECDIR}/bigram}
 : ${code=${LIBEXECDIR}/code}
 
+find=/usr/local/src/findutils-4.1.20/find/find
+
 PATH=/bin:/usr/bin:${BINDIR}; export PATH
 
 : ${PRUNEFS="nfs NFS proc"}
@@ -143,7 +145,8 @@
     $find $NETPATHS ( -type d -regex "$PRUNEREGEX" -prune ) -o -print
   fi
 fi
-} | sort -f | $frcode > $LOCATE_DB.n
+} | sort -f | tee /tmp/debug-updatedb | $frcode > $LOCATE_DB.n
+echo "$0: debug: entries in /tmp/debug-updatedb: `grep -c . /tmp/debug-updatedb`"
 
 # To avoid breaking locate while this script is running, put the
 # results in a temp file, then rename it atomically.


==================================================================
find.c diff.
Change: add __FILE__ and __LINE__ to each error() call containing the
text "changed during execution"

--- find.c.ORIG 2003-05-24 14:36:25.000000000 -0400
+++ find.c      2003-06-16 07:32:57.067319000 -0400
@@ -304,7 +304,7 @@
        error (1, errno, "%s", starting_dir);
       if (stat_buf.st_dev != starting_stat_buf.st_dev ||
          stat_buf.st_ino != starting_stat_buf.st_ino)
-       error (1, 0, _("%s changed during execution of %s"), starting_dir, program_name);
+       error (1, 0, _("%s:%d: %s changed during execution of %s"), __FILE__, __LINE__, starting_dir, program_name);
     }
   else
     {
@@ -341,7 +341,7 @@
        error (1, errno, "%s", pathname);
       if (cur_stat_buf.st_dev != stat_buf.st_dev ||
          cur_stat_buf.st_ino != stat_buf.st_ino)
-       error (1, 0, _("%s changed during execution of %s"), pathname, program_name);
+       error (1, 0, _("%s:%d: %s changed during execution of %s"), __FILE__, __LINE__, pathname, program_name);
 
       process_path (pathname, ".", false, ".");
       chdir_back ();
@@ -525,7 +525,7 @@
        error (1, errno, "%s", pathname);
       if (stat_buf.st_dev != dir_ids[dir_curr].dev ||
          stat_buf.st_ino != dir_ids[dir_curr].ino)
-       error (1, 0, _("%s changed during execution of %s"), starting_dir, program_name);
+       error (1, 0, _("%s:%d: %s changed during execution of %s"), __FILE__, __LINE__, starting_dir, program_name);
 
       for (namep = name_space; *namep; namep += file_len - pathname_len + 1)
        {
@@ -589,9 +589,9 @@
              (dir_curr > 0 ? dir_ids[dir_curr-1].ino : starting_stat_buf.st_ino))
            {
              if (dereference)
-               error (1, 0, _("%s changed during execution of %s"), parent, program_name);
+               error (1, 0, _("%s:%d: %s changed during execution of %s"), __FILE__, __LINE__, parent, program_name);
              else
-               error (1, 0, _("%s/.. changed during execution of %s"), starting_dir, program_name);
+               error (1, 0, _("%s:%d: %s/.. changed during execution of %s"), __FILE__, __LINE__, starting_dir, program_name);
            }
        }
 


Follow-up Comments
------------------


-------------------------------------------------------
Date: Wed 11/24/2004 at 19:06       By: Steve Revilak <srevilak>
Here's what I've observed with findutils-4.2.8 on Solaris 8.

$ find /AUTOMOUNT-POINT
# (where /AUTOMOUNT-POINT is a subdirectory of
#  /ROOT-DIRECTORY-OF-AUTOMOUNTER-MAP)
  * if AUTOMOUNT-POINT was mounted, find descends into directory
  * if AUTOMOUNT-POINT was not mounted, find triggers mount, prints
    warning ("Warning: filesystem ... has recently been mounted."),
    descends into directory

$ find /ROOT-DIRECTORY-OF-AUTOMOUNTER-MAP
  * descends into subdirs for mount points that are already mounted
  * lists directories of mount point subdirs that are not mounted

$ find /ROOT-DIRECTORY-OF-AUTOMOUNTER-MAP -follow
  * descends into all mount point subdirs, triggering mounts as needed.
    Does not emit warnings.



-------------------------------------------------------
Date: Wed 11/24/2004 at 15:06       By: James Youngman <jay>
I have uploaded findutils-4.2.8 to alpha.gnu.org. It keeps track of device 
numbers rather than mount point names, and includes changes designed to cope 
with stacked automount filesystems where all of /a/b/c/d are mounted in turn by 
automountd as find descends the tree.  It turns out that the nastiest cases 
were in the FNS tree under /xfn.

Could you test this to see if it works for you please? 

-------------------------------------------------------
Date: Tue 11/23/2004 at 15:08       By: Steve Revilak <srevilak>
Tested with findutils-4.2.7 on Solaris 8.  Seems to work.

# 4.2.7
$ ~/findutils/build/bin/find -version     
GNU find version 4.2.7

# /data/DF is an automounted directory.  It is not mounted 
# when the command line below is executed:
$ ~/findutils/build/bin/find /data/DF | wc -l
/home/srevilak/findutils/build/bin/find: Warning: filesystem /data/DF has recently been mounted.
112

# a subsequent invocation should print the same number of files
# (but no warning message, as the /data/DF was mounted
# when find was invoked)



-------------------------------------------------------
Date: Sun 11/21/2004 at 23:32       By: James Youngman <jay>
I believe that this problem has been addressed in findutils-4.2.7 in a way that 
should work on Solaris.

-------------------------------------------------------
Date: Sun 11/21/2004 at 17:52       By: Steve Revilak <srevilak>
I didn't have much luck with the patch. (on Solaris 8/Generic_108528-29,
using http://savannah.gnu.org/bugs/download.php?item_id=3998&item_file_id=1901)


make[3]: Entering directory `/home/srevilak/findutils/findutils-4.2.6/find'
if gcc -DHAVE_CONFIG_H -I. -I. -I.. -I../gnulib/lib -I../lib -I../gnulib/lib -I../intl -DLOCALEDIR="/home/srevilak/findutils/build/share/locale"    -g -O2 -MT find.o -MD -MP -MF ".deps/find.Tpo" -c -o find.o find.c; 
then mv -f ".deps/find.Tpo" ".deps/find.Po"; else rm -f ".deps/find.Tpo"; exit 1; fi
if gcc -DHAVE_CONFIG_H -I. -I. -I.. -I../gnulib/lib -I../lib -I../gnulib/lib -I../intl -DLOCALEDIR="/home/srevilak/findutils/build/share/locale"    -g -O2 -MT fstype.o -MD -MP -MF ".deps/fstype.Tpo" -c -o fstype.o fstype.c; 
then mv -f ".deps/fstype.Tpo" ".deps/fstype.Po"; else rm -f ".deps/fstype.Tpo"; exit 1; fi
fstype.c: In function `get_mounted_filesystems':
fstype.c:393: `MOUNTED' undeclared (first use in this function)
fstype.c:393: (Each undeclared identifier is reported only once
fstype.c:393: for each function it appears in.)
fstype.c:401: warning: assignment makes pointer from integer without a cast
fstype.c:405: warning: assignment makes pointer from integer without a cast
fstype.c:414: dereferencing pointer to incomplete type
fstype.c:416: dereferencing pointer to incomplete type
make[3]: *** [fstype.o] Error 1
make[3]: Leaving directory `/home/srevilak/findutils/findutils-4.2.6/find'
make[2]: *** [all-recursive] Error 1
make[2]: Leaving directory `/home/srevilak/findutils/findutils-4.2.6/find'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/home/srevilak/findutils/findutils-4.2.6'
make: *** [all] Error 2


Looking at fstype.c (lines 55 - 65), Solaris 8 has no macro for
MNT_MNTTAB, nor is there one for MNTTABNAME.

Solaris 8's /usr/include/sys/mnttab.h has

  #define MNTTAB  "/etc/mnttab"
  extern int      getmntent(FILE *, struct mnttab *);
  (there is no setmntent, or endmntent)

http://docs.sun.com/app/docs/doc/806-0627/6j9vhfmtb?a=view

Which would make the guts of get_mounted_filesystems() pretty
different.


##################################################################

# For kicks and grins, gdb, with an unpatched fstype.c (from
# findutils-4.2.6, compiled without -O2).  At the very least, it seems to show
# that get_mounted_filesystems is the culprit.

671           if (TraversingDown == direction)
(gdb) print specific_what
$5 = 0x3e6d8 "/data/DF"
(gdb) p direction
$6 = TraversingDown
(gdb) n
688               enum MountPointStateChange transition = get_mount_point_state(what);
(gdb) bt
#0  wd_sanity_check (thing_to_stat=0xffbef4a8 "/data/DF", 
    program_name=0xffbef480 "/home/srevilak/findutils/build/bin/find", what=0xffbef4a8 "/data/DF", 
    oldinfo=0xffbef1b0, newinfo=0xffbef118, parent=0, line_no=807, direction=TraversingDown) at find.c:688
#1  0x0001429c in process_top_path (pathname=0xffbef4a8 "/data/DF") at find.c:806
#2  0x00013a44 in main (argc=2, argv=0xffbef34c) at find.c:496

(gdb) n
689               switch (transition)
(gdb) print transition
$7 = MountPointStateUnchanged  <<<<<<<<<<<<<


This seems to come from 

  # find.c: 604
  mount_points = get_mounted_filesystems();

When get_mounted_filesystems() returns, mount_points is a null pointer.


Backtrace when "mount_points = get_mounted_filesystems()" is executed:

#0  get_mount_point_state (dir=0xffbef4a8 "/data/help") at find.c:609
#1  0x00013de4 in wd_sanity_check (thing_to_stat=0xffbef4a8 "/data/help", 
    program_name=0xffbef480 "/home/srevilak/findutils/build/bin/find", what=0xffbef4a8 "/data/help", 
    oldinfo=0xffbef1b0, newinfo=0xffbef118, parent=0, line_no=807, direction=TraversingDown) at find.c:688
#2  0x0001429c in process_top_path (pathname=0xffbef4a8 "/data/help") at find.c:806
#3  0x00013a44 in main (argc=2, argv=0xffbef34c) at find.c:496


##################################################################



-------------------------------------------------------
Date: Sun 11/21/2004 at 15:37       By: James Youngman <jay>
Could you let me know if the attached patch fixes the problem?   I suspect that 
it might.  

If not, could you try using a debugger to figure out what wd_sanity_check() 
thinks is happening?


-------------------------------------------------------
Date: Sun 11/21/2004 at 15:11       By: Steve Revilak <srevilak>
I built 4.2.6.  Unfortunately, it still seems to have the problem. :(

  $ find --version
  GNU find version 4.2.6

  # try one of yesterday's tests.  /data/DF is not mounted.
  $ find /data/DF
  find: /data/DF changed during execution of find (old device number 74973186, new device number 74711528, filesystem type is nfs) [ref 807]
  [no files listed]

  # after mount, succeeds
  $ find /data/DF | wc
    112     112    1762

                                * * *


I had one other silly idea for dealing with the device number
changes.  It's based on a couple of observations:

  # mount point and device number of /
  $ df /     
  Filesystem           1K-blocks      Used Available Use% Mounted on
  /dev/dsk/c0t0d0s0      6191949   5397663    732367  89% /

  $ perl -e 'print((stat("/"))[0] . "\n")'
  35651584


  # mount point and device number of /data
  $ df /data
  Filesystem           1K-blocks      Used Available Use% Mounted on
  auto_data                    0         0         0   -  /data

  $ perl -e 'print((stat("/data"))[0] . "\n")'
  74973186  


  # mount point and device number of /data/help
  # (matches device number of parent)
  $ df /data/help
  Filesystem           1K-blocks      Used Available Use% Mounted on
  auto_data                    0         0         0   -  /data

  $ perl -e 'print((stat("/data/help"))[0] . "\n")'
  74973186


  # mount point and device number of /data/help/.
  # (st_dev changes)
  $ df /data/help/.
  Filesystem           1K-blocks      Used Available Use% Mounted on
  u12:/disk/sd0h/data/help
                         2166126   1216044    906760  58% /data/help

  $ perl -e 'print((stat("/data/help/."))[0] . "\n")'
  74711515
  

Now, work our way back up the directory tree:

  # st_dev matches /data/help/.
  $ perl -e 'print((stat("/data/help"))[0] . "\n")'
  74711515

  # matches original st_dev for /data
  $ perl -e 'print((stat("/data"))[0] . "\n")'
  74973186

  # matches original st_dev for /
  $ perl -e 'print((stat("/"))[0] . "\n")'
  35651584


Once inside an automounted directory, the device number shouldn't
change; but when descending, it can change _once_ (if a mount occurs).


So, the silly idea:

  * maintain a stack of device numbers.  When entering a new device,
    push st_dev onto the stack.  When leaving a device, pop st_dev
    from the stack

  * Permit the device number of D to change if

    - D is a directory
    - direction is TraversingDown
    - the old device number for D is on top of the stack
    - D has a parent directory
    - the device number of D's parent is on top of the stack
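
The rule above can be sketched in C roughly as follows.  This is only
an illustration of the idea, not code from findutils: struct dev_stack,
dev_stack_push(), dev_stack_pop(), dev_change_permitted() and the
fixed depth limit are all hypothetical names and choices made for the
sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

#define DEV_STACK_MAX 64   /* arbitrary depth limit for this sketch */

/* Stack of the device numbers entered so far, innermost on top. */
struct dev_stack {
    dev_t devs[DEV_STACK_MAX];
    size_t depth;
};

/* Push DEV when entering a new device; false if the stack is full. */
static bool dev_stack_push(struct dev_stack *s, dev_t dev)
{
    assert(s != NULL);
    if (s->depth == DEV_STACK_MAX)
        return false;
    s->devs[s->depth++] = dev;
    return true;
}

/* Pop when leaving a device; false if the stack is already empty. */
static bool dev_stack_pop(struct dev_stack *s)
{
    assert(s != NULL);
    if (s->depth == 0)
        return false;
    s->depth--;
    return true;
}

/* Decide whether a change in D's device number may be tolerated.
   All five conditions from the list must hold. */
static bool dev_change_permitted(const struct dev_stack *s,
                                 bool is_dir, bool traversing_down,
                                 bool has_parent,
                                 dev_t old_dev, dev_t parent_dev)
{
    assert(s != NULL);
    if (!is_dir || !traversing_down || !has_parent || s->depth == 0)
        return false;
    dev_t top = s->devs[s->depth - 1];
    return old_dev == top && parent_dev == top;
}
```

With the numbers shown earlier, the stack would hold 35651584 (/) and
then 74973186 (/data) when find reaches /data/help, so a one-time
change of /data/help's device number would be permitted.  The caller
would then push the new device number, and any further change inside
that filesystem would be refused.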



-------------------------------------------------------
Date: Sun 11/21/2004 at 13:50       By: James Youngman <jay>
You can download a release of findutils in which this problem is
fixed from ftp://alpha.gnu.org/gnu/findutils.

The releases on alpha.gnu.org are for testing purposes, so please
take the time to download the release and verify that your
problem has been solved.  Once the release has been sufficiently
tested, it can be uploaded to ftp.gnu.org for everybody to use it.


-------------------------------------------------------
Date: Sat 11/20/2004 at 22:53       By: James Youngman <jay>
I think I have a solution.  It is to memorise the paths listed in /etc/mnttab 
at startup.  Then when I chdir into a directory and the device number changes, 
AND it is not in the saved copy of mnttab, BUT it is in the new copy (that I 
re-read only when this error occurs), then the directory must have been 
automounted.  Otherwise, I flag the situation as an error and quit.   This 
seems inelegant but workable.
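
A minimal C sketch of that decision, for illustration only.  It is not
the findutils implementation: path_listed() and looks_automounted()
are hypothetical names, and the two mount tables are passed in as
plain arrays of path strings rather than read from /etc/mnttab, so
the platform-specific reading code is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if DIR appears in the list of mount point paths. */
static bool path_listed(const char *const *paths, size_t n,
                        const char *dir)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(paths[i], dir) == 0)
            return true;
    return false;
}

/* To be called only after the device number of DIR was seen to change.
   SAVED is the mount table memorised at startup, FRESH the copy that
   is re-read when the change is noticed.  The change is attributed to
   an automount only if DIR is absent from SAVED and present in FRESH;
   any other combination should be treated as an error by the caller. */
static bool looks_automounted(const char *const *saved, size_t n_saved,
                              const char *const *fresh, size_t n_fresh,
                              const char *dir)
{
    assert(dir != NULL);
    return !path_listed(saved, n_saved, dir)
        && path_listed(fresh, n_fresh, dir);
}
```

For example, with a saved table of "/" and "/data" and a fresh table
that also lists "/data/DF", a device number change on /data/DF would
be accepted, while a change on /data itself would still be flagged.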

-------------------------------------------------------
Date: Sat 11/20/2004 at 19:57       By: Steve Revilak <srevilak>
I had a chance to run some tests with findutils-4.2.5.  Here's what I
found.

  * Caveat: it is not possible for me to do testing on the systems
    where the problem was originally observed.  Instead, I was able
    to use a `similar' environment (Solaris 8, with some filesystems
    mounted via nfs).

  * The original problem (updatedb exiting immediately) was not
    reproduced.  However, this may be a result of the order in which
    file system directories were traversed (i.e., whether it hits the
    problem spots sooner rather than later).

  * I saw sporadic instances of the `changed during execution'
    phenomenon noted earlier.

    /home/srevilak/findutils/build/bin/find: /data/DF changed during execution of /home/srevilak/findutils/build/bin/find (old device number 74973186, new device number 74712136, filesystem type is nfs) [ref 636]

    /home/srevilak/findutils/build/bin/find: /sources/DOWNLOADS changed during execution of /home/srevilak/findutils/build/bin/find (old device number 74973190, new device number 74712147, filesystem type is nfs) [ref 636]

    /home/srevilak/findutils/build/bin/find: /data/DF changed during execution of /home/srevilak/findutils/build/bin/find (old device number 74973186, new device number 74712311, filesystem type is nfs) [ref 817]


    In this case, both directories are the roots of nfs filesystems,
    mounted by Sun's automountd.

Now, here's the thing that I found most interesting.

On this machine, /data/ is a directory that serves as the root of an
automounter map.  Starting from a state where none of the /data/
subdirectories show up in the output of df:

  # make sure we're using the right find
  $ type find
  find is hashed (/home/srevilak/findutils/build/bin/find)

  # try `find /data'
  $  find /data | wc
     find: /data/DF changed during execution of find (old device number 74973186, new device number 74712431, filesystem type is nfs) [ref 817]
        2       2      15

  # okay try again
  $ find /data | wc
       51      51     762

  # demonstrate that /data/ does not contain symlinks
  $ find /data -type l | wc
        0       0       0

  # demonstrate that /data/ contains directories
  $ find /data -type d | wc
       51      51     762

  # same as above, but hit the subdirectories directly
  $ find /data/* -type d | wc
  find: /data/9202doc changed during execution of find (old device number 74973186, new device number 74712435, filesystem type is nfs) [ref 629]
        0       0       0

  # okay, try again
  $ find /data/* -type d | wc
  find: /data/DU changed during execution of find (old device number 74973186, new device number 74712436, filesystem type is nfs) [ref 629]
      235     235    8334

  # Let's try to hit one of the subdirectories that isn't mounted
  $ find /data/texmf -type f | wc
  find: /data/texmf changed during execution of find (old device number 74973186, new device number 74712440, filesystem type is nfs) [ref 629]
        0       0       0

  # Before executing the above, /data/texmf *did not* appear in the
  # machine's mount table.
  #
  # After executing the above, /data/texmf *did* appear in the
  # machine's mount table
  
  # when mounted, it works
  $ find /data/texmf -type f | wc
     3318    3318  190539

If you'd like me to try any other tests, let me know.


-------------------------------------------------------
Date: Sat 11/20/2004 at 01:35       By: 0 <None>
Sorry - I received the mail notification for the 11/19 update, but not for the 
10/31 and 11/08 updates.  Either I missed them, or my ISP's spam filter ate them 
for lunch :(  I'll make sure that doesn't happen in the future.

I'll try to do a test over the weekend.  

-------------------------------------------------------
Date: Fri 11/19/2004 at 19:37       By: James Youngman <jay>
Did you get a chance to do that test?

-------------------------------------------------------
Date: Mon 11/08/2004 at 21:52       By: James Youngman <jay>
I'm marking this bug postponed for now, but will look at it again when I get 
the diagnostic info from the test with findutils-4.2.3 (or later).  

-------------------------------------------------------
Date: Sun 10/31/2004 at 17:29       By: James Youngman <jay>
findutils 4.2.3 (available on ftp://alpha.gnu.org/gnu/findutils) includes an 
enhanced diagnostic for this scenario.  Could you re-test with that, please?  
Thanks, 
James.


-------------------------------------------------------
Date: Mon 06/16/2003 at 19:26       By: Steve Revilak <srevilak>
I hit the submit button a little too soon: "exists" in the Summary line should 
be "exits".  Sorry :(




CC List
-------

CC Address                          | Comment
------------------------------------+-----------------------------
levon --AT-- movementarian --DOT-- org | 



File Attachments
-------------------

-------------------------------------------------------
Date: Sun 11/21/2004 at 15:37  Name: getmntent.patch  Size: 515B   By: jay
Fix for a stupid bug introduced in 4.2.6.
http://savannah.gnu.org/bugs/download.php?item_id=3998&item_file_id=1901






For detailed info, follow this link:
<http://savannah.gnu.org/bugs/?func=detailitem&item_id=3998>

_______________________________________________
  Message sent via/by Savannah
  http://savannah.gnu.org/






