Re: quotearg improvements [was: filenames in error messages]
From: Eric Blake
Subject: Re: quotearg improvements [was: filenames in error messages]
Date: Wed, 13 Feb 2008 20:57:51 -0700
According to Bruno Haible on 2/13/2008 8:13 PM:
| Sorry, but you lost me here. Where did the C trigraphs come into play?
Because the quotearg module _already_ did trigraph quoting (try
ls --quoting-style=c for an example). The question is whether the new
c_maybe style (or whatever better name we come up with), designed for
use in unambiguous error message output, should continue using that
trigraph code or ditch it. I think the consensus is to ditch it by
default, although it might still be worth keeping it in the code as an
option (quotearg, as a module, is useful for more than just error
messages).
|> For C strings, the code already outputs \a, \b, \f, \n, \r, \t, \v, \\,
|> \"; and for all other non-printable characters, a 3-digit \nnn octal
|
| So you want to escape, in an UTF-8 locale, all non-ASCII characters or
| bytes? So that a Japanese user, for an error in file をつけた時でも,
| gets to read
| \343\202\222\343\201\244\343\201\221\343\201\237\346\231\202\343\201\247\343\202\202
| ?
No. The existing quotearg code was already locale-dependent, and tries
its hardest to recognize valid multibyte sequences as printable. It only
prints an octal escape for invalid multibyte sequences and/or nonprintable
characters, according to the current locale's notion of printable.
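To make that behavior concrete, here is a rough sketch of the rule in C. It is not the gnulib quotearg code; escape_c_style is a hypothetical helper written only for illustration, and it assumes the caller has already selected a locale (for example with setlocale). It emits the named escapes, copies valid printable multibyte characters unchanged, and falls back to a 3-digit octal escape for anything else.

```c
#include <stdio.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper, NOT part of gnulib: write a C-style escaped copy of
   the N bytes at SRC into DST (capacity DSTSIZE), using the current locale
   to decide what is printable.  Returns 0 on success, -1 if DST is too
   small.  */
static int
escape_c_style (const char *src, size_t n, char *dst, size_t dstsize)
{
  mbstate_t state;
  size_t o = 0;

  if (dstsize == 0)
    return -1;
  memset (&state, 0, sizeof state);

  for (size_t i = 0; i < n; )
    {
      unsigned char c = (unsigned char) src[i];
      const char *named = NULL;
      wchar_t wc;
      size_t len;

      switch (c)
        {
        case '\a': named = "\\a"; break;
        case '\b': named = "\\b"; break;
        case '\f': named = "\\f"; break;
        case '\n': named = "\\n"; break;
        case '\r': named = "\\r"; break;
        case '\t': named = "\\t"; break;
        case '\v': named = "\\v"; break;
        case '\\': named = "\\\\"; break;
        case '"':  named = "\\\""; break;
        }
      if (named)
        {
          if (o + 2 >= dstsize)
            return -1;
          memcpy (dst + o, named, 2);
          o += 2;
          i++;
          continue;
        }

      len = mbrtowc (&wc, src + i, n - i, &state);
      if (len == (size_t) -1 || len == (size_t) -2 || len == 0
          || !iswprint ((wint_t) wc))
        {
          /* Invalid or incomplete sequence, NUL, or a nonprintable
             character: escape one byte in octal and resynchronize.  */
          if (o + 4 >= dstsize)
            return -1;
          snprintf (dst + o, 5, "\\%03o", (unsigned int) c);
          o += 4;
          i++;
          memset (&state, 0, sizeof state);
          continue;
        }

      /* A valid, printable (possibly multibyte) character: copy as-is.  */
      if (o + len >= dstsize)
        return -1;
      memcpy (dst + o, src + i, len);
      o += len;
      i += len;
    }
  dst[o] = '\0';
  return 0;
}
```

Under this sketch, a name made of valid UTF-8 characters would come through untouched in a UTF-8 locale, while the same bytes would be judged by whatever the C locale on that machine considers printable.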
However, in the C locale, the notion of what is printable varies from
machine to machine. I am often annoyed that on cygwin, where there is no
locale besides C, isprint(0xc0) is false, even though the byte renders in
the terminal as a single-byte printable character (accented A, as if by
iso-8859-1). To date, I've simply maintained a cygwin-specific patch to
quotearg that treats all characters above 0x80 as printable, even when the
C locale claims otherwise.
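The idea behind that kind of override can be sketched as follows. This is not the actual cygwin patch; byte_is_printable and its high_bytes_printable flag are hypothetical names used only to show the shape of the check.

```c
#include <ctype.h>

/* Hypothetical illustration, NOT the real cygwin patch: decide whether a
   single byte should be emitted as-is.  When HIGH_BYTES_PRINTABLE is
   nonzero, every byte above 0x80 is accepted regardless of what the
   current locale's isprint says.  */
static int
byte_is_printable (unsigned char c, int high_bytes_printable)
{
  if (high_bytes_printable && c > 0x80)
    return 1;
  return isprint (c) != 0;
}
```

With the flag set, byte 0xc0 is always reported as printable; with it clear, the answer is whatever isprint returns on that platform. Control characters such as '\a' are unaffected either way.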
|
| This is far, far away from the original goal, and also neglects the
| principle of minimal surprise. I mean, if the goal is to solve
| ambiguities, then please add enough escapes to solve ambiguities, but
| not more than that!
OK - then I think we're settled here - since we are using "" on the
outside of ambiguous strings, we do not need to worry about quoting most
remaining shell special characters. Space, ?, (), [], {}, |, etc. can all
be output as-is - with no change to the quotearg module.
--
Don't work too hard, make some time for fun as well!
Eric Blake address@hidden