Re: [bug-gawk] Strange matching behavior


From: Aharon Robbins
Subject: Re: [bug-gawk] Strange matching behavior
Date: Sun, 12 Oct 2014 15:13:13 +0300
User-agent: Heirloom mailx 12.5 6/20/10

Hi Davide.

Sorry for the delay in replying. I've been under a deadline which
has now passed.

I think this is a real bug. Unfortunately, it's down in the guts of
the regex routines' parser, which means it won't be easy to fix.

I will try to spend some time on it "soon".

Thanks,

Arnold

> Date: Sun, 28 Sep 2014 19:17:05 +0200
> From: Davide Brini <address@hidden>
> To: address@hidden
> Subject: [bug-gawk] Strange matching behavior
>
> I'm not sure what's going on here. Per POSIX, "]" can appear inside a
> bracket expression if it's the first character following the opening [.
> Indeed, this works (and always has, AFAIR):
>
> $ printf '%s\n' '[' ']' '.' '*' '$' | awk '/[][]/ { print "<" $0 ">" }'
> <[>
> <]>
>
> However, if I add some other characters to the list ("." and "*" here), gawk
> fails:
>
> $ printf '%s\n' '[' ']' '.' '*' '$' | awk '/[][.*]/ { print "<" $0 ">" }'
> awk: cmd. line:1: error: Unmatched [ or [^: /[][.*]/
>
> The gawk documentation says:
>
> "To include one of the characters '\', ']', '-', or '^' in a bracket
> expression, put a '\' in front of it."
>
> I think that, as said, that escape should not be necessary, but anyway,
> let's try it:
>
> $ printf '%s\n' '[' ']' '.' '*' '$' | awk '/[\][.*]/ { print "<" $0 ">" }'
> awk: cmd. line:1: error: Unmatched [ or [^: /[\][.*]/
>
>
> But if I add another character, "$" (without escaping anything), it works again:
>
> $ printf '%s\n' '[' ']' '.' '*' '$' | awk '/[][$.*]/ { print "<" $0 ">" }'
> <[>
> <]>
> <.>
> <*>
> <$>
>
> In fact, it works if *any* character is added to the expression instead of
> "$"; however, it has to be exactly in that spot. If it's added after the
> "." or after the "*" it does not work; it must be after the "[".
>
> Finally, if I escape *both* square brackets, it works in the problematic
> case too:
>
> $ printf '%s\n' '[' ']' '.' '*' '$' | awk '/[\]\[.*]/ { print "<" $0 ">" }'
> <[>
> <]>
> <.>
> <*>
>
> $ gawk --version
> GNU Awk 4.1.1, API: 1.1 (GNU MPFR 3.1.2-p10, GNU MP 6.0.0)
> ...
>
> -- 
> D.
>
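Until the parser is fixed, the last case in the quoted report suggests a workaround: escape both square brackets inside the bracket expression. The following is only a minimal sketch based on that report, not a confirmed fix. The variable name `matches` is introduced here for illustration, and the behavior of awk implementations other than the reported gawk 4.1.1 is an assumption.

```shell
# Workaround sketch: escape BOTH brackets inside the bracket expression,
# as in the final example of the quoted report (gawk 4.1.1).
# "matches" is a hypothetical variable name used only for this illustration.
matches=$(printf '%s\n' '[' ']' '.' '*' '$' |
    awk '/[\]\[.*]/ { print "<" $0 ">" }')

# Per the report, this prints one line each for "[", "]", "." and "*";
# "$" is not in the bracket expression, so it is not matched.
printf '%s\n' "$matches"
```

The unescaped form `[][.*]` is the one POSIX permits and would be preferable once the bug is fixed; the escaped form trades that for working on the affected version.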


