bug-gawk


From: Michael Klement
Subject: [bug-gawk] Mistaken interpretation of the POSIX standard causes Gawk 4.1.1 not to recognize newlines as field separators with -P
Date: Mon, 25 May 2015 13:08:03 -0400

Hi,

Since at least the 2004 edition of the POSIX standard (http://pubs.opengroup.org/onlinepubs/009695399/utilities/awk.html), newlines should always be considered field separators, irrespective of the value of `FS`.
From the "Variables and Special Variables" section:

"a <newline> shall always be a field separator, no matter what the value of FS is."

This language is still present in the 2013 edition (http://pubs.opengroup.org/onlinepubs/9699919799/utilities/awk.html).

(The unfortunate thing is that the 2004 edition contained sloppy language in the "Description" section, which seemingly contradicts the above:

"a field is a string of non- <blank>s"

This has been corrected in the 2013 edition:

"a field is a string of non- <blank> non- <newline> characters"
)

By contrast, Gawk (as of version 4.1.1) states the following in its manual under -P, --posix:

Only space and tab act as field separators when FS is set to a single space, newline does not.

and acts accordingly (except when RS is set to the empty string).
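
To see the difference for yourself, here is a minimal sketch. It assumes an `awk` on your PATH; the helper name `count_fields` and the sample input are made up for illustration. Setting RS to ';' keeps the newline inside the record, so NF shows whether the newline was treated as a field separator.

```shell
# Count the fields awk sees in one record that contains an embedded newline.
# RS=';' makes ';' the record terminator, so the newline stays inside the record.
# count_fields is a hypothetical helper; its arguments are the awk command to run.
count_fields() {
  printf 'a b\nc;' | "$@" -v RS=';' '{ print NF }'
}

# Default mode: space, tab and newline all separate fields (a, b, c).
count_fields awk

# To compare against POSIX mode (requires gawk; the result depends on the version):
# count_fields gawk --posix
```

If the newline counts as a separator, the record splits into a, b and c; if it does not, "b<newline>c" stays together as a single field.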

The bottom line is: If the above change in behavior is the only one that -P / --posix effects, this option should never have been introduced in the first place, because Gawk's *default* behavior is actually the POSIX-compliant one, and using the option, somewhat ironically, makes Gawk NON-compliant.

You can find further discussion and examples at http://stackoverflow.com/a/30406868/45375 

Regards,

Michael
