
## Re: [bug-gawk] Interesting floating point behavior

 From: Nelson H. F. Beebe Subject: Re: [bug-gawk] Interesting floating point behavior Date: Fri, 20 Jan 2012 10:20:20 -0700 (MST)

```Robert Kennedy <address@hidden> reports puzzlement over the
awk computation "a=$1; b=a*10000; c=b%100" that produces these values:

>> 0.69     6900    100

Here is what is happening:

In 128-bit binary IEEE 754 arithmetic:

hoc128> a = 0.69
hoc128> b = a * 10000
hoc128> c = b % 100
hoc128> println a, b, c
0.69  6_899.999_999_999_999_999_999_999_999_999_998_42  99

In 128-bit decimal IEEE 754 arithmetic:

hocd128> a = 0.69
hocd128> b = a * 10000
hocd128> c = b % 100
hocd128> println a, b, c
0.69  6_900 0

Because most decimal fractions, like 0.69, are not exactly
representable in binary arithmetic, you often see the string-of-9s
phenomenon when you do the inexact round-trip decimal -> binary ->
decimal.

This has nothing to do with gawk: it is a fact of life that arises
from inexact base conversion.

A famous example used to illustrate the need for decimal arithmetic is
sales tax computation: 5% on a purchase of $0.70.  In decimal
arithmetic, the total is 1.05 * 0.70 = $0.735, and the tax man's
rounding says you owe $0.74.

In binary arithmetic, 0.70 is not exactly representable, no matter
what your precision is, and the computation produces
0.734_999_999_999_999_99, which rounds down to 0.73, cheating the tax
authorities of $0.01.

They DO care about this, and in most jurisdictions, such arithmetic
MUST be done in decimal.

While a difference of a penny is insignificant when you buy a Ferrari,
it can add up to millions of dollars annually in businesses that have
large numbers of small transactions, like telephone companies and
grocery stores.

IEEE 754-2008, the revision of IEEE 754-1985, includes decimal
arithmetic, and additional rounding rules demanded by tax laws (e.g.,
round-ties-upward: 0.735 -> 0.74).  So far, only IBM z-Series and IBM
POWER7 have full support of the 2008 standard.

I have versions of mawk and nawk that use decimal arithmetic instead
of binary arithmetic: for them, Robert's experiment produces this
output:

echo -e "Input\t*10000\t%100"; \
for i in 0.67 0.68 0.69 0.70; do \
echo $i | dmawk '{a=$1; b=a*10000; c=b%100; print a,"\t",b,"\t",c}'; \
done

Input   *10000  %100
0.67     6700    0
0.68     6800    0
0.69     6900    0
0.70     7000    0

They are built with the 128-bit decimal format, which supplies exactly
34 decimal digits.  Here is a computation of the machine epsilon,
which should be 10**(-34 + 1):

% dmawk -f macheps.awk
machine epsilon = 1e-33 = 10**-33
machine epsilon = 1e-33 = 10**-33

Versions of gcc with support for decimal arithmetic, and binary
packages with hoc, dmawk, dnawk, and dlua are available here:

http://www.math.utah.edu/pub/mathcw/

My large book on that library is essentially done, with some minor
tweaks in progress before going to the publisher.

-------------------------------------------------------------------------------
- Nelson H. F. Beebe                    Tel: +1 801 581 5254                  -
- University of Utah                    FAX: +1 801 581 4148                  -
- Department of Mathematics, 110 LCB    Internet e-mail: address@hidden  -
- 155 S 1400 E RM 233                       address@hidden  address@hidden -
- Salt Lake City, UT 84112-0090, USA    URL: http://www.math.utah.edu/~beebe/ -
-------------------------------------------------------------------------------

```
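
The experiment in the message above can be reproduced without the decimal awk builds. The following is only a sketch: it assumes Python's `decimal` module with a 34-digit context as a stand-in for the IEEE 754 128-bit decimal format (it is not the arithmetic that dmawk or hoc actually use), and the function names are made up for illustration. It runs Robert's computation in binary64 and in 34-digit decimal, and repeats the machine-epsilon check by dividing by ten until adding the result to 1 no longer changes it.

```python
from decimal import Context, Decimal, ROUND_HALF_EVEN

# Assumption: a 34-digit context stands in for the 128-bit decimal format.
# Python's decimal module is not an IEEE 754 decimal128 implementation.
DEC128 = Context(prec=34, rounding=ROUND_HALF_EVEN)


def experiment_binary(text):
    """Robert's computation in binary64 (a Python float)."""
    a = float(text)      # inexact decimal -> binary conversion happens here
    b = a * 10000
    c = b % 100
    return a, b, c


def experiment_decimal(text):
    """The same computation in 34-digit decimal arithmetic."""
    a = DEC128.create_decimal(text)   # exact: the input is already decimal
    b = DEC128.multiply(a, Decimal(10000))
    c = DEC128.remainder(b, Decimal(100))
    return a, b, c


def decimal_epsilon():
    """Smallest power of ten eps with 1 + eps != 1 in the 34-digit context."""
    one, ten = Decimal(1), Decimal(10)
    eps = one
    while DEC128.add(one, DEC128.divide(eps, ten)) != one:
        eps = DEC128.divide(eps, ten)
    return eps


if __name__ == "__main__":
    for text in ("0.67", "0.68", "0.69", "0.70"):
        _, b_bin, c_bin = experiment_binary(text)
        _, b_dec, c_dec = experiment_decimal(text)
        print(f"{text}  binary64: {b_bin!r} {c_bin!r}  decimal: {b_dec} {c_dec}")
    print("decimal machine epsilon =", decimal_epsilon())
```

For the input 0.69, the binary64 product lands just below 6900, so the remainder comes out just below 100, which awk's default output format then displays as 100. The decimal remainder is exactly zero, and the epsilon loop stops at 10**-33, in agreement with the figures quoted in the message.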

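
The sales-tax example from the message can be checked the same way. This is a hedged sketch rather than a recipe for real tax code: the helper names are hypothetical, binary64 stands in for "binary arithmetic", and Python's `ROUND_HALF_UP` (which rounds ties away from zero) is assumed to match the round-ties-upward rule for positive amounts.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def total_binary(price, factor):
    """Price times tax factor in binary64, formatted to whole cents."""
    total = float(factor) * float(price)
    # Formatting rounds the stored binary value, which lies below 0.735.
    return f"{total:.2f}"


def total_decimal(price, factor):
    """Price times tax factor in decimal, ties rounded upward to cents."""
    total = Decimal(factor) * Decimal(price)   # exactly 0.7350 for the example
    return total.quantize(CENT, rounding=ROUND_HALF_UP)


if __name__ == "__main__":
    print("binary64:", total_binary("0.70", "1.05"))
    print("decimal: ", total_decimal("0.70", "1.05"))
```

With the inputs from the message, the binary64 path yields 0.73 and the decimal path yields 0.74: the one-cent discrepancy described above.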