bug-coreutils
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

bug#60989: [PATCH] rm: fail on duplicate input if force not enabled


From: Pádraig Brady
Subject: bug#60989: [PATCH] rm: fail on duplicate input if force not enabled
Date: Sat, 21 Jan 2023 18:51:51 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Thunderbird/109.0

On 21/01/2023 15:51, Łukasz Sroka wrote:
On 21/01/2023 15:53, Pádraig Brady <P@draigbrady.com> wrote:

On 21/01/2023 13:05, Łukasz Sroka wrote:
      When the input files contain duplicates, then the rm fails. Because
      duplicates occur most often when the * is used and the shell unwraps it.
      There is a very common scenario when a user accidentally enters space
      after a filename, or enters space instead of forward slash.
      Example:

        rm prefix_ *

      The user intended to remove all files with a `prefix_` but removed all
      of the files in cwd.
      The program quits immediately when a duplicate is detected, to prevent
      pressing `y` because user expected a prompt regarding removing multiple
      files.
      The force option disables this function to enable scripts to work
      without modifying them.

```
diff --git a/src/rm.c b/src/rm.c
index 354e2b0df..e4f9949f0 100644
--- a/src/rm.c
+++ b/src/rm.c
@@ -123,6 +123,16 @@ diagnose_leading_hyphen (int argc, char **argv)
       }
   }

+static bool
+find_duplicates (int n_files, char **files)
+{
+  for (int l = 0; l < n_files-1; l++)
+    for (int r = l+1; r < n_files; r++)
+      if (strcmp(files[l], files[r]) == 0)
+          return true;
+  return false;
+}
+
   void
   usage (int status)
   {
@@ -211,6 +221,7 @@ main (int argc, char **argv)
     bool preserve_root = true;
     struct rm_options x;
     bool prompt_once = false;
+  bool force_rm = false;
     int c;

     initialize_main (&argc, &argv);
@@ -238,6 +249,7 @@ main (int argc, char **argv)
             x.interactive = RMI_NEVER;
             x.ignore_missing_files = true;
             prompt_once = false;
+          force_rm = true;
             break;

           case 'i':
@@ -352,6 +364,17 @@ main (int argc, char **argv)
     uintmax_t n_files = argc - optind;
     char **file =  argv + optind;

+  if (!force_rm && find_duplicates(n_files, file))
+    {
+      /* Because usually when the input files are duplicated it means
that the user
+         sumbitted both a directory and an * as separate arguments,
probably by accident */
+      fprintf (stderr,
+               "%s: input contains duplicates, most likely you've put "
+               "both * and a file from the same directory.\n",
+               program_name);
+      return EXIT_FAILURE;
+    }
+
     if (prompt_once && (x.recursive || 3 < n_files))
       {
         fprintf (stderr,
```

An interesting proposal.
The main protection would be for `dir/ *` rather than `file_prefix_ *`.
The former would be unusual for a user to type, while the latter more usual, 
but wouldn't trigger the protection AFAICS.
This ads O(N^2) on each interaction, so if it was to be included probably only 
enabled with --interactive.

cheers,
Pádraig

Yeah, true. Implemented it quickly and focused on the `dir/ *`
scanerio at first, because that has happened to me -.-
It was because I tried to do something quickly while editing command
from backsearch and ended up with `rm tmp/ *` instead of `rm tmp/*`.
To make the prefix_ protection it won't be possible without checking
if a "prefix" exits on disk (eg. file and file1), which would be more
time complex.
I don't like the --interactive approach, as when you're trying to do
something quickly, you probably won't opt for doing it interactively
and you're probably either disabled or ignoring shell warnings.

True, but at a high level -I can be seen as
an unintrusive "be more careful" option,
and so is often enabled by default with an alias.
Red Hat flavored systems even enable the more intrusive
-i option by default for the root user for example.

The more I think about this problem, the more I tilt towards
implementing it in shell. If you'd've written "rm prefix_ *" and had
10k files there, it would be so much easier to detect it on the shell
side than checking every filename with the other and triggering
syscalls for every one...
Either way if you have other solutions in mind, please update me, as
it wouldn't require to configure shell in dockers or remote machines,
thus making the solution more available.

A shell option like `shopt -s uniqueglob` would indeed be more general.
There are various other globbing options, like the similar "failglob"
which fails to execute the command if a glob doesn't match anything.

cheers,
Pádraig





reply via email to

[Prev in Thread] Current Thread [Next in Thread]