Re: ensure numeric comparison


From: Neil R. Ormos
Subject: Re: ensure numeric comparison
Date: Sun, 8 May 2022 12:40:16 -0500 (CDT)

Peng Yu wrote:
> david kerns wrote:
>> Peng Yu wrote:
>>> david kerns wrote:
>>>> Peng Yu wrote:

>>>>> I want to make sure whether the following
>>>>> code is guaranteed to compare $1 and $2
>>>>> numerically as long as the input of the 1st
>>>>> and 2nd fields are legitimate numbers or
>>>>> empty strings (which are considered as 0).

>>>>> awk -e '{ print ($1 < $2) }' < input

>>>>> Or there is any corner case when even the
>>>>> 1st and 2nd fields are legitimate numbers or
>>>>> empty strings, they may be compared as
>>>>> strings.

>>>> pretty sure the only way to guarantee numeric
>>>> comparison is to add 0

>>>> awk -e '{ print ($1 +0 < $2 + 0) }' < input

>>>> (not sure if it's required on both sides, but
>>>> it can't hurt)

>>> Is there a test case for which `awk -e '{
>>> print ($1 < $2) }' < input` can not compare $1
>>> and $2 numerically given the input columns 1
>>> and 2 are numbers or empty strings?

>> see 6.3.2.1 String Type versus Numeric Type in
>> https://www.gnu.org/software/gawk/manual/gawk.html

> But my question involves empty string which is
> "unassigned". It is not explained in the 3x3
> table in that section.

The table in the manual does appear to answer your question because the cited 
section of the manual also states, "Uninitialized variables also have the 
strnum attribute."

See also Sec. 4.2 Examining Fields: 

| If you try to reference a field beyond the last
| one (such as $8 when the record has only seven
| fields), you get the empty string. (If used in a
| numeric operation, you get zero.)
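
A quick way to observe the behavior quoted above, as a minimal sketch. The three-field sample record is made up for illustration, and any POSIX-style awk on the PATH is assumed:

```shell
# $8 does not exist in a three-field record.
# First value:  1 if $8 compares equal to the empty string.
# Second value: $8 used in a numeric operation.
out=$(printf '%s\n' 'a b c' | awk '{ print ($8 == ""), ($8 + 0) }')
printf '%s\n' "$out"   # prints: 1 0
```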

I don't know if there are any test cases that satisfy the "[...] can not 
compare $1 and $2 numerically [...]" constraint.

However, if you do not trust that missing elements of $N are "uninitialized 
variables" that have the strnum attribute, you can resolve the question at a 
practical level by following David Kerns' suggestion of explicitly coercing the 
comparands to numeric type.

As a simplification of David Kerns' suggestion, I would use the unary "+" 
operator:

  print ( +$1 < +$2 )
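
The coerced form can be tried on a few records, as a rough sketch. The sample input and the use of -F, (so that an empty first field is possible) are assumptions for illustration, not something taken from the original question:

```shell
# Hypothetical sample records: "10,9", an empty first field with "-1", and "2,10".
# a: comparison with both sides coerced to numbers by unary "+".
# b: the uncoerced comparison, printed for reference only.
out=$(printf '%s\n' '10,9' ',-1' '2,10' |
      awk -F, '{ a = (+$1 < +$2); b = ($1 < $2); print a, b }')
printf '%s\n' "$out"
```

The first column should read 0, 0, 1: 10 < 9 is false, 0 < -1 is false (the empty field coerces to zero), and 2 < 10 is true. The second column is printed without any claim about its value for the record with the empty field, since that is the case under discussion.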


