bug-gawk

doc tweak re backslashes in bracket expressions


From: Ed Morton
Subject: doc tweak re backslashes in bracket expressions
Date: Sun, 3 Nov 2024 07:50:10 -0600
User-agent: Mozilla Thunderbird

Just a small tweak suggestion for the gawk documentation regarding backslashes inside bracket expressions.

https://www.gnu.org/software/gawk/manual/html_node/Bracket-Expressions.html currently says (**emphasis mine**):

The treatment of ‘\’ in bracket expressions is compatible with other awk implementations **and is also mandated by POSIX**.

but POSIX, at least in the 2024 edition of the spec, seems pretty clear (see references below*) that a backslash inside a bracket expression is not an escape character, so per POSIX these would be compliant behavior:

$ printf 'a\\d\n' | grep -E '[\]'
a\d

$ printf 'a\\d\n' | sed -En '/[\]/p'
a\d

while these would not:

$ printf 'a\\d\n' | awk '/[\]/'
awk: cmd. line:1: /[\]/
awk: cmd. line:1:  ^ unterminated regexp

$ printf 'a\\d\n' | awk --posix '/[\]/'
awk: cmd. line:1: /[\]/
awk: cmd. line:1:  ^ unterminated regexp

so maybe either remove that "and is also mandated by POSIX" statement or provide a reference to where that behavior IS mandated by POSIX to clear up any confusion.
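For comparison, here is a minimal sketch of the doubled form `[\\]`, which should sidestep the disagreement: awk reads `\\` inside the brackets as one escaped backslash, while grep and sed, following the POSIX text quoted below, read it as a set that merely lists the backslash twice. The `has_backslash_*` function names are hypothetical and used only for illustration, and `sed -E` is assumed to be available.

```shell
# Hypothetical helpers (not from the gawk manual or POSIX): each prints only
# the input lines that contain a literal backslash, using the doubled form.
has_backslash_awk()  { awk '/[\\]/'; }        # awk: \\ is an escaped backslash
has_backslash_grep() { grep -E '[\\]'; }      # grep: set listing \ twice
has_backslash_sed()  { sed -En '/[\\]/p'; }   # sed: same reading as grep

# Two input lines: one containing a backslash, one without.
printf 'a\\d\nabc\n' | has_backslash_awk
printf 'a\\d\nabc\n' | has_backslash_grep
printf 'a\\d\nabc\n' | has_backslash_sed
```

Each of the three pipelines should print only the `a\d` line, so scripts that need to run under both awk and grep/sed can use this spelling regardless of how the documentation question is resolved.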

    Ed.

*From the current, 2024, POSIX regexp spec, https://pubs.opengroup.org/onlinepubs/9799919799/basedefs/V1_chap09.html (**emphasis mine**):

> [9.1 Regular Expression Definitions](https://pubs.opengroup.org/onlinepubs/9799919799/basedefs/V1_chap09.html#tag_09_01)
> ...
> escape sequence
>
> The escape character followed by any single character, which is
> thereby "escaped". The escape character is a \<backslash\> that is
> **neither in a bracket expression** nor itself escaped.

which tells us that a backslash within a bracket expression is not an escape character, and this:

> [9.3.5 RE Bracket Expression](https://pubs.opengroup.org/onlinepubs/9799919799/basedefs/V1_chap09.html#tag_09_03_05)
>
> ... When the bracket
> expression appears within an ERE, the special characters ... and '```\```'
> (... and \<backslash\>, respectively) shall **lose their special meaning within
> the bracket expression**

which reiterates that a backslash within a bracket expression has no special meaning, and there's nothing I can see in [the POSIX awk spec](https://pubs.opengroup.org/onlinepubs/9799919799/utilities/awk.html) to override the above definitions.

