bug-coreutils
Re: rm -r sometimes produces errors under NFS


From: Vincent Lefevre
Subject: Re: rm -r sometimes produces errors under NFS
Date: Wed, 7 Mar 2007 00:38:22 +0100
User-agent: Mutt/1.5.14-vl-r16324 (2007-03-03)

On 2007-03-06 23:41:30 +0100, Jim Meyering wrote:
> Vincent Lefevre <address@hidden> wrote:
> > No need to store names: if it's the second pass, all the files have
> > already been unlinked.
> 
> Not necessarily.  Have you looked at the code?
> New files may have been added since the original opendir
> or since the most recent rewinddir.  We'd need some way
> to distinguish those new names from the ones we've already
> successfully unlinked.

But my point is that if there are new files, an unlink on them
shouldn't return an ENOENT error ("No such file or directory").

> >> > In fact, it isn't necessarily useful to remember anything.
> >> > When rm attempts to remove a file in a recurse phase,
> >> > no errors should be reported if the file doesn't exist.
> >>
> >> No.  Any POSIX-conforming rm implementation is required to
> >> report such errors, unless you specify -f.
> >
> > Wrong. In the recurse phase, if rm tries to unlink a file, this means
> > that the file has existed. So, this wouldn't be contrary to POSIX.
> 
> Your conclusion is invalid.
> What if some other process removed it first?

AFAIK, POSIX doesn't say that there should be an error in this
particular case, and IMHO, it is better and more consistent *not*
to return an error. Indeed, consider the following case:

1. "rm -r dir" is started.
2. A second process removes some file in some subdirectory of dir,
   and the "rm -r" process hasn't had the time to see it.
3. "rm -r" terminates (without any error).

Why would you want "rm -r" to return an error if some file is removed
by a second process between the time "rm -r" does the readdir and the
time unlink is performed on this file by "rm -r", but have no errors
in the case I've described above?
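
To make the two timings concrete, here is a minimal sketch in Python
(not the coreutils code). The helper names remove_listed and make_dir,
and the interfere hook that stands in for the second process, are
assumptions made for illustration only. In both cases the same file is
removed by someone else; only the moment differs.

```python
import os
import tempfile


def remove_listed(directory, interfere=None):
    """Unlink every name seen in one listing of `directory`.

    `interfere` is a hypothetical hook standing in for a second process:
    it runs after the listing is taken and before any unlink is attempted.
    Returns the names whose unlink failed with ENOENT.
    """
    names = sorted(os.listdir(directory))  # the "readdir" step
    if interfere is not None:
        interfere()
    missing = []
    for name in names:
        try:
            os.unlink(os.path.join(directory, name))
        except FileNotFoundError:  # unlink failed with ENOENT
            missing.append(name)
    return missing


def make_dir(names):
    """Create a temporary directory containing empty files `names`."""
    directory = tempfile.mkdtemp()
    for name in names:
        open(os.path.join(directory, name), "w").close()
    return directory


# Case A: the other process removes "a" before the directory is listed.
dir_a = make_dir(["a", "b"])
os.unlink(os.path.join(dir_a, "a"))
early = remove_listed(dir_a)

# Case B: the other process removes "a" between the listing and the unlink.
dir_b = make_dir(["a", "b"])
late = remove_listed(dir_b, lambda: os.unlink(os.path.join(dir_b, "a")))

print(early, late)  # [] ['a']

os.rmdir(dir_a)
os.rmdir(dir_b)
```

Both directories end up empty, yet only case B has an ENOENT to
report.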

> If you're still convinced you have a case, you're going to have to
> start quoting the standard. I based my statement on what I know of
> POSIX, e.g., from this part of the rm specification:
> 
>       4. If the current file is a directory, ...
>       If the current file is not a directory, rm shall perform actions
>       equivalent to the unlink() function defined in the System
>       Interfaces volume of IEEE Std 1003.1-200x called with a pathname
>       of the current file used as the path argument. If this
>       fails for any reason, rm shall write a diagnostic message
>       to standard error, do nothing more with the current file,
>       and go on to any remaining files.

IMHO, the fact that unlink returns an ENOENT error in the recurse phase
*because of the implementation algorithm* should not be regarded as a
failure.

Also note that in the NFS case, the errors are due to the rewind, but
step 2c of the POSIX description of rm reads[*]:

  For each entry contained in file, other than dot or dot-dot, the
  four steps listed here (1 to 4) shall be taken with the entry as
  if it were a file operand. The rm utility shall not traverse
  directories by following symbolic links into other parts of the
  hierarchy, but shall remove the links themselves.

[*] http://www.opengroup.org/onlinepubs/009695399/utilities/rm.html
(I don't know if this is the latest version...)

So, if you want a strict interpretation of "For each entry", I don't
think a rewind is allowed. Otherwise the consequences should be taken
into account with care.

Also, this doesn't explain why the directory itself isn't removed
(after the "rm -r test", I get the errors, but an empty directory
remains).

> > But there's still a race condition (unrelated to NFS) in the rm code.
> 
> Something new?  Please give details.

This is precisely what I've said above:

1. rm sees the filename in the directory stream.
2. A second process removes the file.
3. rm does an unlink on the filename and gets an ENOENT error.

rm returns an error, but IMHO this is incorrect: since the file doesn't
exist at the time of the unlink, it should not be regarded as an entry
of the directory (just as if the file had been removed before rm could
see it in the recurse phase). In other words, to decide whether there
is a failure, rm should use the latest information available, and here
the latest information is given by the return code of unlink.
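
A minimal sketch of that rule in Python, assuming a hypothetical helper
unlink_entry with a seen_in_readdir flag (neither exists in rm; this
only illustrates the behaviour proposed here, which is not what the
quoted POSIX text or the current rm code does): ENOENT is ignored for
names that came from the directory stream, and still reported for
names given as operands.

```python
import errno
import os
import tempfile


def unlink_entry(path, seen_in_readdir):
    """Unlink `path`; return a diagnostic string, or None if nothing to report.

    ENOENT is ignored only when the name came from a directory listing:
    the entry existed a moment ago and is gone now, which is the goal.
    Every other failure is still reported.
    """
    try:
        os.unlink(path)
    except OSError as exc:
        if exc.errno == errno.ENOENT and seen_in_readdir:
            return None
        return f"cannot remove '{path}': {os.strerror(exc.errno)}"
    return None


# A name that no longer exists, as if a second process had removed it.
scratch = tempfile.mkdtemp()
gone = os.path.join(scratch, "already-removed")

recursed = unlink_entry(gone, seen_in_readdir=True)   # recurse phase
operand = unlink_entry(gone, seen_in_readdir=False)   # command-line operand

print(recursed)
print(operand)

os.rmdir(scratch)
```

The first call reports nothing; the second still produces a
diagnostic, so "rm nonexistent" without -f keeps failing as before.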

-- 
Vincent Lefèvre <address@hidden> - Web: <http://www.vinc17.org/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.org/blog/>
Work: CR INRIA - computer arithmetic / Arenaire project (LIP, ENS-Lyon)
