bug-findutils
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[bug #46305] Doing "find -L . -type d -delete" fails on symlinks to dire


From: raf
Subject: [bug #46305] Doing "find -L . -type d -delete" fails on symlinks to directories.
Date: Sat, 22 Oct 2022 18:01:26 -0400 (EDT)

Follow-up Comment #4, bug #46305 (project findutils):

Hi,

I was just about to submit a bug report about this, and found this existing
bug report. I think it's a shame that it has been set to WONTFIX. It's easy to
fix, and fixing it would be in keeping with the principle of least
astonishment.

Luckily, I noticed this in the notes:

> Please feel free to follow up if you disagree or you can see a
> simple solution. In the meantime I will work on some
> documentation updates to try to make things clearer.

So here goes. Here is my bug report with two (almost identical) solutions to
choose from, followed by more commentary to address the existing discussion:

> "find -L /path -delete" fails to delete symlinks to directories
> 
> When following symlinks with -H or -L, find sees a symlink to a
> directory as a directory, but it doesn't remember the fact that
> there was a symlink. So, when deleting, it uses rmdir rather
> than unlink (or their equivalents), which fails to remove the
> symlink, and that can break the overall deletion.
> 
> Steps to repeat:
> 
>   $ mkdir t dd dd/d
>   $ touch dd/d/f ff
>   $ ln -s ../dd t/ld
>   $ ln -s ../ff t/lf
>   $ find t dd ff -ls
>   ... t
>   ... t/lf -> ../ff
>   ... t/ld -> ../dd
>   ... dd
>   ... dd/d
>   ... dd/d/f
>   ... ff
>   $ find -L t -delete
>   find: cannot delete 't/ld': Not a directory
>   find: cannot delete 't': Directory not empty
>   $ find t dd ff -ls
>   ... t
>   ... t/ld -> ../dd
>   ... dd
>   ... ff
>   $ rm -r t dd ff
> 
> Expected output:
> 
>   $ find -L t -delete
>   $ find t dd ff -ls
>   find: 't': No such file or directory
>   ... dd
>   ... ff
> 
> The solution is either to remember the fact that there's a symlink, or
> check for it with lstat just before each deletion of a directory (if
> -H or -L), and use unlink rather than rmdir when it's a symlink. If
> the existence of the symlink is already being remembered (e.g., for
> -xtype), then the first solution is probably easier. Otherwise, the
> second solution is probably easier.

Now for some more commentary:

I think it's a mistake to think that the symlink's ultimate target directory
should be deleted when following symlinks, or that that is what the user is
expecting when they use -L/-H and -delete together. There is (necessarily) a
distinction between following symlinks for the purpose of file system
traversal versus following symlinks for the purpose of deletion.

The first is what -L does, and what the user expects. The second is impossible
(or ridiculously difficult/tedious/boring/errorprone outside the kernel, and
probably very unwise (think long chains of symlinks and race conditions)). If
any user does expect such behaviour, they just haven't thought about it
enough. If they did, they'd realise that it's not an option. So that's no
reason not to fix this.

And it's not just about directories. When following symlinks and deleting, if
there's a symlink to a non-directory, it's the symlink that gets deleted, not
the symlink's ultimate target. The example in the above bug report
demonstrates that. The same should apply to symlinks to directories, but it
doesn't. Consistency is a reason to fix this.

Unless this is fixed (by switching from rmdir to unlink when trying to delete
a symlink), find fails to do what the user wants, and forces them to resort to
a subsequent rm command. That's less than awesome.

The find(1) manual entry refers to "confusing behaviour", in relation to -L
and -delete, but it's only confusing because it hasn't been explained to the
user. That's easy to do as well. I've just written a find-alternative, and
I've documented the behaviour like this (adapted for find terminology):

  When -delete is used with the -L or -H option,
  and a symlink to a directory is followed, the
  symlink's ultimate target directory's contents
  are searched, and any matches found there are
  deleted, but the target directory itself is
  never deleted. It isn't possible to delete a
  filesystem entry via a symlink to it. If the
  directory itself matches the search criteria,
  the symlink is deleted. Similarly, when a
  symlink to a non-directory is followed, and
  the symlink's ultimate target matches the search
  criteria, the symlink is deleted, not the
  ultimate target.

That should prevent any confusion arising (and probably be reassuring). It
explains what the behaviour is and the reason for it being what it is. Clearly
describing the behaviour should be enough to set the user's expectations (at
least of those that read the documentation). You're welcome to
incorporate/adapt the above text into find's manual entry, but it really
belongs in the -delete section, rather than the -L section.

As for the issue of symlinks becoming broken during the deletion operation, I
don't think that affects this at all. If find(1) thinks that it needs to
delete a directory, because it's remembering something that is no longer true,
it doesn't matter, because it's only the symlink itself that needs to be
deleted anyway, and the symlink still exists and needs to be deleted, whether
it's broken or not. If the comment in the manual entry about "confusing
behaviour" was only about this, it could probably be removed. But perhaps it
relates to something else.

Anyway, that's what I think. I hope you'll reconsider the WONTFIX status.

I'm submitting a patch that fixes it and documents and tests the behaviour. In
pred_delete(), just after it switches to rmdir when unlink fails because it's
a directory, I've added a check for rmdir failing because it's not a
directory, and have it switch to unlink. That's all that's needed. It's a very
small, simple change.

I assume that "Copyright-paperwork-exempt: Yes" because, even though the patch
is more than 10 lines (but only if you include docs and tests), the 10 line
code addition was copied and pasted from the 10 lines above, and 1 line of
code and 1 line of comments were changed, so you could see it as only 1 or 2
new lines of code. If that's not how it works, let me know.



    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?46305>

_______________________________________________
Message sent via Savannah
https://savannah.gnu.org/




reply via email to

[Prev in Thread] Current Thread [Next in Thread]