Re: filenames in error messages

Eric Blake
Wed, 06 Feb 2008 18:57:59 -0700
[adding bug-gnulib, as owner of the quotearg module]

According to Karl Berry on 1/21/2008 3:47 PM:
| There was a suggestion on bug-standards
| http://lists.gnu.org/archive/html/bug-standards/2008-01/msg00000.html
| about enclosing the filename in <...> when it's a url.
| Two questions:
| 1) should we try to invent an unambiguous syntax for specifying "source"
|    names?
| 2) Does anyone know of anything besides next-error in Emacs that parses
|    these messages?  It seems inevitable that other programs (such as IDE's)
|    would, but I don't know of any specifically.
| Actually, I don't think the issue is about url's specifically so much as
| special characters in general.  Although next-error manages to parse
| simple filenames containing colons, I expect if a filename was
| sufficiently strange, it could be fooled.  It seems like it would be
| nice to have an unambiguous syntax.

According to Karl Berry on 1/30/2008 4:30 PM:
|     Are we talking about something like this?
|       http://www.example.com/some\:path
| It's the : after the http that concerns me.  Every url would have to
| have that \ after the protocol, and as Eric says, then cut-and-paste
| wouldn't work, which is a drag.  Using quotes as in "http://..." doesn't
| suffer from that.

I noticed several potential issues with the quotearg module.  One is the
limitation mentioned above that quotearg_colon would output this when
using the "escape" style:


which cannot be copied.  Another is that quotearg_char ONLY outputs a
backslash before the char IF the quoting style is not "literal", "shell",
or "shell-always".  Those three styles ignore the request to escape
additional characters.  Another problem with those three styles is that
when used with quotearg_n_style_mem, they cannot handle embedded NUL (the
'\0' character is not escaped, but the API does not output a length of the
resulting quoted buffer).  Sure, you can use quotearg_buffer or
quotearg_alloc, but then you have to manage the string yourself, whereas
it would be kind of nice to be able to do
  error (...,"%s", quotearg_n_style_mem (n, shell_quoting_style, str, len))

Is there any reason that shortcuts like quotearg_style_mem are not
defined, in contrast with quotearg_style(s,a) short for

I also noticed that once you call quotearg_colon, all future calls to
quotearg() will behave as though they were quotearg_colon unless you
manually call set_char_quoting(NULL,':',0).  In other words,
quotearg_colon does not remember nor restore the prior state of the colon
character in the default quoting options.

One thing I like about the "shell" quoting style is that its use of quotes
is dependent on the contents - in the common case, where nothing needs
quoting, the input can be reused as the output.  Along the lines of the
problem at hand, I wonder if the GNU Coding Standards should use a style
like the following:

If the file name consists only of characters from the portable set
([-_a-zA-Z0-9./], and probably a few others like + that are not required
by POSIX), then messages relative to a file can look like:

program:file:line: message

If the file name contains problematic characters (including the : in a
URL, or non-printable characters), then the file name is surrounded by
double quotes, with C escapes for the problematic characters:

program:"embedded colon:, quote\", and spaces":line: message
program:"http://example.com/file":line: message

Should I go ahead and hack on a new quoting style in quotearg.c that can
be used in this manner, adding "" around the string only if an escape
sequence or quote_these_too character is encountered?
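
To make the idea concrete, here is a rough standalone sketch of the
behavior I have in mind.  It is not a patch to quotearg.c and uses none of
its internals: the names quote_file_name, struct out, put and is_plain are
made up for illustration, the "safe" set is my guess at a reasonable one,
ASCII is assumed, and bytes above 0x7f are passed through unchanged, which
a real implementation would have to reconsider.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical output cursor: counts every byte, stores only what fits.  */
struct out
{
  char *buf;
  size_t size;
  size_t used;
};

static void
put (struct out *o, char c)
{
  if (o->used + 1 < o->size)
    o->buf[o->used] = c;
  o->used++;
}

/* Assumed "safe" set: the POSIX portable filename characters
   plus '/' and '+'.  */
static int
is_plain (unsigned char c)
{
  return (('a' <= c && c <= 'z') || ('A' <= c && c <= 'Z')
          || ('0' <= c && c <= '9')
          || (c != '\0' && strchr ("-_./+", c) != NULL));
}

/* Store in BUF (of BUFSIZE bytes) a NUL-terminated rendering of the LEN
   bytes at NAME: unchanged if every byte is plain, otherwise wrapped in
   double quotes with C escapes.  Return the length the full result
   needs, not counting the terminating NUL, as snprintf does.  */
static size_t
quote_file_name (char *buf, size_t bufsize, char const *name, size_t len)
{
  struct out o = { buf, bufsize, 0 };
  int plain = len != 0;         /* an empty name is rendered as "" */
  size_t i;

  for (i = 0; i < len && plain; i++)
    plain = is_plain ((unsigned char) name[i]);

  if (!plain)
    put (&o, '"');
  for (i = 0; i < len; i++)
    {
      unsigned char c = name[i];
      if (c == '"' || c == '\\')
        {
          put (&o, '\\');
          put (&o, c);
        }
      else if (c < 0x20 || c == 0x7f)
        {
          /* Three-digit octal, so a following digit cannot be misread.  */
          put (&o, '\\');
          put (&o, '0' + (c >> 6));
          put (&o, '0' + ((c >> 3) & 7));
          put (&o, '0' + (c & 7));
        }
      else
        put (&o, c);
    }
  if (!plain)
    put (&o, '"');

  if (bufsize != 0)
    buf[o.used < bufsize ? o.used : bufsize - 1] = '\0';
  return o.used;
}
```

With this sketch, src/main.c comes back untouched, the URL and the
embedded-colon name come back exactly as in the two quoted examples above,
and because the input length is explicit, an embedded NUL shows up as \000
instead of truncating the result.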

Don't work too hard, make some time for fun as well!

Eric Blake
