bug-gawk
Re: [bug-gawk] Unstated differences between gawk and POSIX


From: Ed Morton
Subject: Re: [bug-gawk] Unstated differences between gawk and POSIX
Date: Fri, 3 Aug 2018 10:58:02 -0500

OK, so apparently gawk really doesn't behave per the POSIX standard when comparing numeric-string to numeric-string:

In the Expressions In Awk section POSIX says:

    Comparisons (with the '<', "<=", "!=", "==", '>', and ">=" operators) shall be made numerically if both operands are numeric, if one is numeric and the other has a string value that is a numeric string, or if one is numeric and the other has the uninitialized value. Otherwise, operands shall be converted to strings as required and a string comparison shall be made.

So a numeric-string to numeric-string comparison should be a string comparison, while gawk does a numeric comparison:
$ echo '07 7.0' | gawk '$1 == $2'
07 7.0

$ awk -v x="07" -v y="7.0" 'BEGIN{print (x == y ? "t" : "f")}'
t
That behavior is as documented under Variable Typing in the gawk manual.
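
For anyone who wants a result that does not depend on which rule a given awk follows, the comparison type can be forced explicitly. This is only a minimal sketch, assuming some awk is on the PATH; the shell variable names str_result and num_result are made up for illustration:

```shell
# Force the comparison type so the result does not depend on how a
# particular awk classifies numeric strings.

# Concatenating "" makes both operands strings: "07" vs "7.0".
str_result=$(echo '07 7.0' | awk '{ print (($1 "") == ($2 "") ? "t" : "f") }')

# Adding 0 makes both operands numbers: 7 vs 7.
num_result=$(echo '07 7.0' | awk '{ print (($1 + 0) == ($2 + 0) ? "t" : "f") }')

echo "string comparison:  $str_result"
echo "numeric comparison: $num_result"
```

The forced string comparison prints "f" because the two strings differ, and the forced numeric comparison prints "t" because both operands convert to 7.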

I wouldn't want the default behavior of gawk to change as I think it makes sense but I also think SOMETHING should happen to give users a heads up. Maybe gawk should do as POSIX defines when run with --posix and maybe there should be a note in the documentation about it? Or maybe someone should work on getting the POSIX spec updated to reflect how gawk and other awks actually behave (idk how you'd even go about that but if you do....).

Thoughts?

    Ed.

On 8/2/2018 10:30 AM, Ed Morton wrote:
When reading the POSIX standard for awk recently (https://groups.google.com/d/msg/comp.lang.awk/qYhgpz08pN8/ihidWMLmCQAJ) I noticed a couple of cases wrt comparisons where gawk behaves differently from what [parts of] the POSIX spec say. I posted a question about it at comp.lang.awk (https://groups.google.com/d/msg/comp.lang.awk/qYhgpz08pN8/9wbMr9XKCQAJ) to see if I might just be misunderstanding the spec, but that doesn't seem to be the case. The issues are:

1) POSIX states that a numeric-string vs numeric-string comparison should be a string comparison, but gawk does a numeric comparison.
2) POSIX states that an uninitialized field (e.g. $3 or higher when you only have 2 input fields) should have the same value as an uninitialized array element or scalar variable, and that value is zero-or-null. In a comparison, however, gawk treats an uninitialized field as a string with value null only.
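
To check issue 2 on a particular awk, something like the following sketch could be used. It compares an uninitialized scalar and a nonexistent field against both 0 and the null string. It assumes some awk is on the PATH; the awk variable x and the shell variable names are hypothetical, chosen only for this illustration:

```shell
# Only two fields are supplied, so $3 is never assigned; x is never assigned either.
var_vs_zero=$(echo 'a b' | awk '{ print (x == 0  ? "t" : "f") }')
var_vs_null=$(echo 'a b' | awk '{ print (x == "" ? "t" : "f") }')
fld_vs_zero=$(echo 'a b' | awk '{ print ($3 == 0  ? "t" : "f") }')
fld_vs_null=$(echo 'a b' | awk '{ print ($3 == "" ? "t" : "f") }')

echo "uninitialized variable == 0:  $var_vs_zero"
echo "uninitialized variable == \"\": $var_vs_null"
echo "nonexistent field \$3 == 0:    $fld_vs_zero"
echo "nonexistent field \$3 == \"\":   $fld_vs_null"
```

The uninitialized variable compares equal to both 0 and the null string. The `$3 == 0` line is the one in question: an awk that treats the nonexistent field as zero-or-null would report "t" there, while one that treats it as a null string only would report "f".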

You can see the comp.lang.awk question for details and references if you're interested. I think it's POSIX that should change to match gawk's behavior (which is also the behavior of at least some other awks) rather than the other way around, but idk how to make that happen (yet) or if/when it will happen. Nor do I know if gawk should document the differences in the meantime. So I at least wanted to give you a heads up about the issues, and you can decide if anything should change on the gawk tool or documentation side.

     Ed.

