bug-coreutils
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

bug#29832: Bug in 'ls -FQ': incorrectly quoted characters


From: hackerb9
Subject: bug#29832: Bug in 'ls -FQ': incorrectly quoted characters
Date: Sat, 23 Dec 2017 23:14:28 -0800

Hello!

Thank you for your continued improvements and maintenance of 'ls' in
coreutils-8.28.

I have found a somewhat subtle bug. I've included a patch and a test script
to make sure there are no regressions.

In short, the problem is that a backlash is incorrectly put in front of the
characters @, =, >, and |. This only happens when using filetype indicators
and C style quoting.

  $ touch @
  $ ls
  @
  $ ls -F
  @
  $ ls -Q
  "@"
  $ ls -FQ
  "\@"        # <-- incorrect

There should be no backslash before the @.

This was a little bit tricky to track down as the code responsible is not
commented and appears to have been prematurely optimized by the programmer:

  ls.c:2141
  if (file_type <= indicator_style)
    {
      char const *p;
      for (p = &"*=>@|"[indicator_style - file_type]; *p; p++)
        set_char_quoting (filename_quoting_options, *p, 1);
    }

Basically, it says, if --file-type was used, then we should be sure to
quote '=', '>', and '@'. If --classify is used, then we should do the same
but also add in '*'. Or, to put it in less compressed C:

  if (indicator_style == file_type)
    {
      char const *p;
      for (p = "=>@|"; *p; p++)
        set_char_quoting (filename_quoting_options, *p, 1);
    }
  if (indicator_style == classify)
    {
      char const *p;
      for (p = "*=>@|"; *p; p++)
        set_char_quoting (filename_quoting_options, *p, 1);
    }

However, there is no need to quote those characters when the quoting-style
is set to C. In fact, it is an error to do so as these are not valid C
escape sequences.

I am guessing the idea of the code was that those characters would not be
backslash escaped, but rather the entire filename would be wrapped in
quotes so the indicator character cannot be confused for part of the
filename. And, indeed, this does work for some of the quoting styles:
"shell[-escape][-always]" work that way. Strangely, "c-maybe" also works
that way.

  $ ls -F --quoting-style=literal
  @
  $ ls -F --quoting-style=shell
  '@'
  $ ls -F --quoting-style=shell-always
  '@'
  $ ls -F --quoting-style=shell-escape
  '@'
  $ ls -F --quoting-style=shell-escape-always
  '@'
  $ ls -F --quoting-style=c-maybe
  "@"        # <-- This is correct!

I spent some time debugging it, but I didn't find the perfect fix as I
don't understand why it was written the way it was. Rather, I found
something that I think is an improvement and a step towards doing things
right. I've made a patch that simply checks if the quoting style is C and
skips the problematic lines. Ideally, somebody who understands why this
code was written this way in the first place can make a better patch, but
this should work for now and not break anything for anybody else.

Please find attached below, my patch for src/ls.c and a regression test
script to be placed in tests/ls/quoting-style-c.sh.

--b9

P.S. This bug may also apply to the -b (--quoting-style=escape). However,
the output from it appears to be more like a mish-mash of C and shell
escapes. (For example, tabs are shown as "\t", as in C, but spaces are
shown as "\ ", as in bourne shell. ) The -Q option, on the other hand,
generates perfectly acceptable C strings (tested with gcc and python),
except for those few buggy characters fixed in this patch.

Attachment: ls-FQ.patch
Description: Text Data

Attachment: quoting-style-c.sh
Description: Bourne shell script


reply via email to

[Prev in Thread] Current Thread [Next in Thread]