## Re: [bug-gawk] Percent Signs in External Commands on Windows

From: Eli Zaretskii
Subject: Re: [bug-gawk] Percent Signs in External Commands on Windows
Date: Wed, 11 Apr 2012 19:40:09 +0300

> Date: Wed, 11 Apr 2012 08:27:21 -0600 (MDT)
> From: "Nelson H. F. Beebe" <address@hidden>
>
> With all this discussion of the brain-damage of Microsoft Windows
> shells in their processing of command-line arguments, no one seems to
> have mentioned the possibility of representing special characters as
> octal or hexadecimal escape sequences to hide them from special
> interpretations as quoting characters, escape characters, or
> pattern-matching characters.

Thanks, but I don't think this is workable; see below.

> So, the questions for the users of gawk on Windows are:
>
>       Does a command invoked from gawk via system() or a pipeline
>
>               programname \042 \047 \052
>
>       pass the literal characters quotation-mark, apostrophe, and
>       asterisk to the program, or does the program see character
>       strings of four characters each, as shown?

The latter, of course: the special treatment of a backslash is a
feature of Posix shells, which the Windows shells don't implement.  If
they did, it would wreak havoc on Windows file names that use
backslashes as directory separators.
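
One way to check this empirically is to have the invoked program report
exactly what it received.  The following is only a minimal sketch: the
names describe_arg and dump_args are hypothetical helpers made up for
this illustration, not part of gawk or of any Windows library.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format one argument as
   "argv[i] (n chars): text" into buf, so the exact characters the
   program received are visible.  %lu with a cast is used instead of
   %zu because older C runtimes may not support %zu. */
int describe_arg(char *buf, size_t size, int index, const char *arg)
{
    return snprintf(buf, size, "argv[%d] (%lu chars): %s",
                    index, (unsigned long) strlen(arg), arg);
}

/* Print every argument, one per line.  Intended to be called from
   main as dump_args(argc, argv, stdout). */
void dump_args(int argc, char **argv, FILE *out)
{
    char line[1024];
    int i;

    for (i = 0; i < argc; i++) {
        describe_arg(line, sizeof line, i, argv[i]);
        fprintf(out, "%s\n", line);
    }
}
```

If a program built around dump_args is run from gawk's system() on
Windows with the arguments \042 \047 \052, then, per the answer above,
each argument should be reported as four characters long.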

>       And does \052 get converted to * which gets expanded into
>       a list of matching files, or does it do the job of hiding
>       a literal asterisk?

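Neither: \052 is not converted to *, so nothing gets expanded, and the
program sees the four characters \052 rather than a literal asterisk,
for the same reason.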

>       And do those backslashes need to be doubled, as in
>
>               programname \\042 \\047 \\052
>
>       or are they just treated as ordinary characters?

They are treated as ordinary characters, again, because Windows file
names use the d:\foo\bar format.

> Some further background for non-Windows users: unlike the case in Unix
> where the shells have ALWAYS handled command-line argument quoting and
> pattern matching, in PC-DOS, arguments are passed directly to the
> program, which must then take on a job that the DOS shell should have
> done once and for all.  Then Windows came along, first running on top
> of the DOS shell, and then later, on bare hardware with the shell
> supplied on top of Windows as CMD.EXE.  Because the command-line was
> never popular among most Windows users, it did not get much attention,
> and its painful deficiencies never got properly fixed.

This is not entirely accurate.  PC-DOS also had a shell, called
COMMAND.COM, and it also included some "handling" of quotes.
Wildcards indeed were never handled by any Microsoft shell, not on
DOS, not on Windows.  They are expanded by the application's startup
code, so that the argv[] array in C programs gets the results of
expansion.
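
To illustrate the kind of matching such startup code has to do, here is
a rough sketch of a wildcard matcher.  It is not the code of any actual
C runtime; wild_match is a hypothetical name, and the sketch assumes
only the '*' and '?' wildcards and case-sensitive comparison (real
Windows file-name matching ignores case).  Real startup code would also
have to read the directory and apply the matcher to each file name,
which is omitted here.

```c
/* Return 1 if name matches pattern, 0 otherwise.
   '*' matches any run of characters, including an empty one;
   '?' matches exactly one character; anything else, including a
   backslash, matches only itself. */
int wild_match(const char *pattern, const char *name)
{
    while (*pattern != '\0') {
        if (*pattern == '*') {
            /* Collapse consecutive stars into one. */
            while (*pattern == '*')
                pattern++;
            if (*pattern == '\0')
                return 1;       /* a trailing '*' matches the rest */
            /* Try to match the remaining pattern at each position. */
            for (; *name != '\0'; name++)
                if (wild_match(pattern, name))
                    return 1;
            return 0;
        }
        if (*name == '\0')
            return 0;           /* pattern left over, name exhausted */
        if (*pattern != '?' && *pattern != *name)
            return 0;           /* literal character mismatch */
        pattern++;
        name++;
    }
    return *name == '\0';       /* both must end together */
}
```

With this matcher, "*.awk" matches "test.awk", while the pattern
"\052" does not match a file named "*", because the backslash and the
digits are compared as ordinary characters.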

CMD is actually not that bad; it has many useful features that are
almost unknown to most Windows users.  But handling of quoted
arguments is still horridly wrong, presumably for some backward
compatibility reasons.

> In 2006, Microsoft introduced PowerShell, an optional shell with
> interesting features, but alas, one that is not even close to a POSIX
> shell.  Because PowerShell is optional, gawk programmers might not
> want to rely on its being available, but it should be investigated
> whether it solves the horrid problems of CMD.EXE: if it does, perhaps
> we should just declare that gawk users on Windows are strongly
> recommended to install PowerShell.

I studied PowerShell when it was first released, and concluded that it
is not a panacea.  It solves some problems (e.g., it supports the
'..' style of quoting), but introduces others.  And, as you say, it is not
Posix-compliant anyway.  So I don't recommend it as a solution for
the problems discussed here, and certainly don't agree that Gawk
should rely on it.

Thanks.