Re: Fw: awk standard incompatibility
From: Paul Eggert
Subject: Re: Fw: awk standard incompatibility
Date: Thu, 22 Jun 2006 20:04:32 -0700
User-agent: Gnus/5.1008 (Gnus v5.10.8) Emacs/21.4 (gnu/linux)

Denys Vlasenko <address@hidden> writes:
>> http://busybox.net/bugs/view.php?id=914
>>
>> but now it looks like it's gawk being a bit non-standard compliant,

No, gawk and Busybox awk both conform to the standard here. It's a
bug in the script, which uses the following pattern:

    /^[\t ]*UDK_3_0_0[\t ]*{/

POSIX does not allow this pattern. See POSIX 1003.1-2004 section 9.4.3
<http://www.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap09.html#tag_09_04_03>,
which says that the behavior of an ERE is undefined "if a left-brace
is not part of a valid interval expression". Hence an awk
implementation can do what it likes with this pattern: Busybox awk can
print a diagnostic, and gawk can treat "{" as if it were "[{]", and
both behaviors conform to the standard.
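
One way the script could avoid the undefined case is to spell the brace as a bracket expression, so the ERE never contains a bare "{". A minimal sketch, assuming the script only needs to match the line that opens the block (the function name match_udk is made up for illustration and is not from the original script):

```shell
# Hypothetical helper: print lines that open a UDK_3_0_0 block.
# "[{]" is a bracket expression holding a literal left brace, so the
# ERE contains no bare "{" and stays within behavior POSIX defines.
match_udk() {
    awk '/^[\t ]*UDK_3_0_0[\t ]*[{]/'
}

# The first input line opens a block and is printed; the second is not.
printf '\tUDK_3_0_0 {\nUDK_3_0_0 = 1\n' | match_udk
```

Since "[{]" is an ordinary bracket expression, gawk and Busybox awk should agree on it, with or without --posix.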

It is true that gawk does not conform to POSIX in programs like
'/a{3}/ {print}': by default it does not treat "{3}" as an interval
expression unless you set the POSIXLY_CORRECT environment variable or
use the --posix option. However, gawk is in good company here; e.g.,
Solaris 10 'nawk' behaves like gawk, not like POSIX. So, in practice,
portable scripts cannot use unescaped "{" in regular expressions, even
as part of POSIX-blessed interval expressions.
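
For a small fixed count, the portable alternative is simply to write the repetition out by hand. A minimal sketch, assuming the intent of '/a{3}/' is "three consecutive a's" (the function name match_three_a is made up for illustration):

```shell
# Hypothetical helper: print lines containing three consecutive "a"s.
# /aaa/ matches what /a{3}/ matches under POSIX, but has no brace in
# the ERE, so it does not depend on interval-expression support.
match_three_a() {
    awk '/aaa/ { print }'
}

# Only "baaad" contains "aaa"; the literal text "a{3}" does not.
printf 'baaad\naa\na{3}\n' | match_three_a
```

This gets unwieldy for large or variable counts, but it behaves the same whether or not the awk in use honors interval expressions.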