bug-coreutils
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

bug#60989: [PATCH] rm: fail on duplicate input if force not enabled


From: Łukasz Sroka
Subject: bug#60989: [PATCH] rm: fail on duplicate input if force not enabled
Date: Sat, 21 Jan 2023 16:51:23 +0100

On 21/01/2023 15:53, Pádraig Brady <P@draigbrady.com> wrote:
>
> On 21/01/2023 13:05, Łukasz Sroka wrote:
> >      When the input files contain duplicates, then the rm fails. Because
> >      duplicates occur most often when the * is used and the shell unwraps 
> > it.
> >      There is a very common scenario when a user accidentally enters space
> >      after a filename, or enters space instead of forward slash.
> >      Example:
> >
> >        rm prefix_ *
> >
> >      The user intended to remove all files with a `prefix_` but removed all
> >      of the files in cwd.
> >      The program quits immediately when a duplicate is detected, to prevent
> >      pressing `y` because user expected a prompt regarding removing multiple
> >      files.
> >      The force option disables this function to enable scripts to work
> >      without modifying them.
> >
> > ```
> > diff --git a/src/rm.c b/src/rm.c
> > index 354e2b0df..e4f9949f0 100644
> > --- a/src/rm.c
> > +++ b/src/rm.c
> > @@ -123,6 +123,16 @@ diagnose_leading_hyphen (int argc, char **argv)
> >       }
> >   }
> >
> > +static bool
> > +find_duplicates (int n_files, char **files)
> > +{
> > +  for (int l = 0; l < n_files-1; l++)
> > +    for (int r = l+1; r < n_files; r++)
> > +      if (strcmp(files[l], files[r]) == 0)
> > +          return true;
> > +  return false;
> > +}
> > +
> >   void
> >   usage (int status)
> >   {
> > @@ -211,6 +221,7 @@ main (int argc, char **argv)
> >     bool preserve_root = true;
> >     struct rm_options x;
> >     bool prompt_once = false;
> > +  bool force_rm = false;
> >     int c;
> >
> >     initialize_main (&argc, &argv);
> > @@ -238,6 +249,7 @@ main (int argc, char **argv)
> >             x.interactive = RMI_NEVER;
> >             x.ignore_missing_files = true;
> >             prompt_once = false;
> > +          force_rm = true;
> >             break;
> >
> >           case 'i':
> > @@ -352,6 +364,17 @@ main (int argc, char **argv)
> >     uintmax_t n_files = argc - optind;
> >     char **file =  argv + optind;
> >
> > +  if (!force_rm && find_duplicates(n_files, file))
> > +    {
> > +      /* Because usually when the input files are duplicated it means
> > that the user
> > +         sumbitted both a directory and an * as separate arguments,
> > probably by accident */
> > +      fprintf (stderr,
> > +               "%s: input contains duplicates, most likely you've put "
> > +               "both * and a file from the same directory.\n",
> > +               program_name);
> > +      return EXIT_FAILURE;
> > +    }
> > +
> >     if (prompt_once && (x.recursive || 3 < n_files))
> >       {
> >         fprintf (stderr,
> > ```
>
> An interesting proposal.
> The main protection would be for `dir/ *` rather than `file_prefix_ *`.
> The former would be unusual for a user to type, while the latter more usual, 
> but wouldn't trigger the protection AFAICS.
> This ads O(N^2) on each interaction, so if it was to be included probably 
> only enabled with --interactive.
>
> cheers,
> Pádraig

Yeah, true. Implemented it quickly and focused on the `dir/ *`
scanerio at first, because that has happened to me -.-
It was because I tried to do something quickly while editing command
from backsearch and ended up with `rm tmp/ *` instead of `rm tmp/*`.
To make the prefix_ protection it won't be possible without checking
if a "prefix" exits on disk (eg. file and file1), which would be more
time complex.
I don't like the --interactive approach, as when you're trying to do
something quickly, you probably won't opt for doing it interactively
and you're probably either disabled or ignoring shell warnings.

The more I think about this problem, the more I tilt towards
implementing it in shell. If you'd've written "rm prefix_ *" and had
10k files there, it would be so much easier to detect it on the shell
side than checking every filename with the other and triggering
syscalls for every one...
Either way if you have other solutions in mind, please update me, as
it wouldn't require to configure shell in dockers or remote machines,
thus making the solution more available.





reply via email to

[Prev in Thread] Current Thread [Next in Thread]