bug-coreutils
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

bug#29832: Bug in 'ls -FQ': incorrectly quoted characters


From: Pádraig Brady
Subject: bug#29832: Bug in 'ls -FQ': incorrectly quoted characters
Date: Sun, 24 Dec 2017 12:26:30 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.8.0

On 24/12/17 07:14, address@hidden wrote:
> Hello!
> 
> Thank you for your continued improvements and maintenance of 'ls' in
> coreutils-8.28.
> 
> I have found a somewhat subtle bug. I've included a patch and a test script
> to make sure there are no regressions.
> 
> In short, the problem is that a backlash is incorrectly put in front of the
> characters @, =, >, and |. This only happens when using filetype indicators
> and C style quoting.
> 
>   $ touch @
>   $ ls
>   @
>   $ ls -F
>   @
>   $ ls -Q
>   "@"
>   $ ls -FQ
>   "\@"        # <-- incorrect
> 
> There should be no backslash before the @.
> 
> This was a little bit tricky to track down as the code responsible is not
> commented and appears to have been prematurely optimized by the programmer:
> 
>   ls.c:2141
>   if (file_type <= indicator_style)
>     {
>       char const *p;
>       for (p = &"*=>@|"[indicator_style - file_type]; *p; p++)
>         set_char_quoting (filename_quoting_options, *p, 1);
>     }
> 
> Basically, it says, if --file-type was used, then we should be sure to
> quote '=', '>', and '@'. If --classify is used, then we should do the same
> but also add in '*'. Or, to put it in less compressed C:
> 
>   if (indicator_style == file_type)
>     {
>       char const *p;
>       for (p = "=>@|"; *p; p++)
>         set_char_quoting (filename_quoting_options, *p, 1);
>     }
>   if (indicator_style == classify)
>     {
>       char const *p;
>       for (p = "*=>@|"; *p; p++)
>         set_char_quoting (filename_quoting_options, *p, 1);
>     }
> 
> However, there is no need to quote those characters when the quoting-style
> is set to C. In fact, it is an error to do so as these are not valid C
> escape sequences.
> 
> I am guessing the idea of the code was that those characters would not be
> backslash escaped, but rather the entire filename would be wrapped in
> quotes so the indicator character cannot be confused for part of the
> filename. And, indeed, this does work for some of the quoting styles:
> "shell[-escape][-always]" work that way. Strangely, "c-maybe" also works
> that way.
> 
>   $ ls -F --quoting-style=literal
>   @
>   $ ls -F --quoting-style=shell
>   '@'
>   $ ls -F --quoting-style=shell-always
>   '@'
>   $ ls -F --quoting-style=shell-escape
>   '@'
>   $ ls -F --quoting-style=shell-escape-always
>   '@'
>   $ ls -F --quoting-style=c-maybe
>   "@"        # <-- This is correct!
> 
> I spent some time debugging it, but I didn't find the perfect fix as I
> don't understand why it was written the way it was. Rather, I found
> something that I think is an improvement and a step towards doing things
> right. I've made a patch that simply checks if the quoting style is C and
> skips the problematic lines. Ideally, somebody who understands why this
> code was written this way in the first place can make a better patch, but
> this should work for now and not break anything for anybody else.
> 
> Please find attached below, my patch for src/ls.c and a regression test
> script to be placed in tests/ls/quoting-style-c.sh.
> 
> --b9
> 
> P.S. This bug may also apply to the -b (--quoting-style=escape). However,
> the output from it appears to be more like a mish-mash of C and shell
> escapes. (For example, tabs are shown as "\t", as in C, but spaces are
> shown as "\ ", as in bourne shell. ) The -Q option, on the other hand,
> generates perfectly acceptable C strings (tested with gcc and python),
> except for those few buggy characters fixed in this patch.

Thanks for the detailed analysis.

c-maybe is correct here as quotearg explicitly ignores these extra chars
to be quoted when outer quotes are added.  Therefore we should
probably ignore these extra chars to be quoted when we're adding
outer quotes unconditionally.

This gnulib patch would handle this specific case.
I'll look into generalizing for all outer quoted cases:

diff --git a/lib/quotearg.c b/lib/quotearg.c
index 8e432e1..47ffc5c 100644
--- a/lib/quotearg.c
+++ b/lib/quotearg.c
@@ -709,7 +709,9 @@ quotearg_buffer_restyled (char *buffer, size_t buffersize,
           }
         }

-      if (! (((backslash_escapes && quoting_style != 
shell_always_quoting_style)
+      if (! (((backslash_escapes
+               && quoting_style != shell_always_quoting_style
+               && quoting_style != c_quoting_style)
               || elide_outer_quotes)
              && quote_these_too
              && quote_these_too[c / INT_BITS] >> (c % INT_BITS) & 1)

cheers,
Pádraig





reply via email to

[Prev in Thread] Current Thread [Next in Thread]