[gawkdiffs] [SCM] gawk branch, master, updated. 63aeb055437534122ddb774
From: 
Arnold Robbins 
Subject: 
[gawkdiffs] [SCM] gawk branch, master, updated. 63aeb055437534122ddb774b7eecc261ab6e592a 
Date: 
Sun, 12 Aug 2012 11:15:38 +0000 
This is an automated email from the git hooks/postreceive script. It was
generated because a ref change was pushed to the repository containing
the project "gawk".
The branch, master has been updated
via 63aeb055437534122ddb774b7eecc261ab6e592a (commit)
from eb126595c90ac7f6242415dfd29d6c88a8f0f0a2 (commit)
Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.
- Log -----------------------------------------------------------------
http://git.sv.gnu.org/cgit/gawk.git/commit/?id=63aeb055437534122ddb774b7eecc261ab6e592a
commit 63aeb055437534122ddb774b7eecc261ab6e592a
Author: Arnold D. Robbins <address@hidden>
Date: Sun Aug 12 14:15:19 2012 +0300
Rework material on arithmetic.
diff --git a/doc/ChangeLog b/doc/ChangeLog
index a04db48..495bead 100644
--- a/doc/ChangeLog
+++ b/doc/ChangeLog
@@ -1,3 +1,9 @@
+2012-08-12 Arnold D. Robbins <address@hidden>
+
+ * gawk.texi: Merged discussion of numbers from Appendix C into
+ the chapter on arbitrary precision arithmetic. Did some surgery
+ on that chapter to organize it a little better.
+
2012-04-01 Andrew J. Schorr <address@hidden>
* gawk.texi: Replace documentation of removed functions update_ERRNO and
diff --git a/doc/gawk.info b/doc/gawk.info
index 6846063..c3559f3 100644
--- a/doc/gawk.info
+++ b/doc/gawk.info
@@ -359,21 +359,29 @@ texts being (a) (see below), and with the Back-Cover Texts being (b)
* I18N Portability:: `awk'level portability issues.
* I18N Example:: A simple i18n example.
* Gawk I18N:: `gawk' is also internationalized.
+* General Arithmetic:: An introduction to computer arithmetic.
+* Floating Point Issues:: Stuff to know about floatingpoint numbers.
+* String Conversion Precision:: The String Value Can Lie.
+* Unexpected Results:: Floating Point Numbers Are Not Abstract
+ Numbers.
+* POSIX Floating Point Problems:: Standards Versus Existing Practice.
+* Integer Programming:: Effective integer programming.
* Floatingpoint Programming:: Effective floatingpoint programming.
* Floatingpoint Representation:: Binary floatingpoint representation.
* Floatingpoint Context:: Floatingpoint context.
* Rounding Mode:: Floatingpoint rounding mode.
+* Gawk and MPFR:: How `gawk' provides
+ arbitrary-precision arithmetic.
* Arbitrary Precision Floats:: Arbitrary precision floatingpoint
arithmetic with `gawk'.
* Setting Precision:: Setting the working precision.
* Setting Rounding Mode:: Setting the rounding mode.
* Floatingpoint Constants:: Representing floatingpoint constants.
* Changing Precision:: Changing the precision of a number.
* Exact Arithmetic:: Exact arithmetic with floatingpoint
numbers.
* Integer Programming:: Effective integer programming.
* Arbitrary Precision Integers:: Arbitrary precision integer
 arithmetic with `gawk'.
* MPFR and GMP Libraries:: Information about the MPFR and GMP
libraries.
+* Exact Arithmetic:: Exact arithmetic with floatingpoint
+ numbers.
+* Arbitrary Precision Integers:: Arbitrary precision integer arithmetic with
+ `gawk'.
* Nondecimal Data:: Allowing nondecimal input data.
* Array Sorting:: Facilities for controlling array traversal
and sorting arrays.
@@ -438,14 +446,14 @@ texts being (a) (see below), and with the Back-Cover Texts being (b)
* Anagram Program:: Finding anagrams from a dictionary.
* Signature Program:: People do amazing things with too much time
on their hands.
* Debugging:: Introduction to `gawk' Debugger.
+* Debugging:: Introduction to `gawk' debugger.
* Debugging Concepts:: Debugging in General.
* Debugging Terms:: Additional Debugging Concepts.
* Awk Debugging:: Awk Debugging.
* Sample Debugging Session:: Sample Debugging Session.
+* Sample Debugging Session:: Sample debugging session.
* Debugger Invocation:: How to Start the Debugger.
* Finding The Bug:: Finding the Bug.
* List of Debugger Commands:: Main Commands.
+* List of Debugger Commands:: Main debugger commands.
* Breakpoint Control:: Control of Breakpoints.
* Debugger Execution Control:: Control of Execution.
* Viewing And Changing Data:: Viewing and Changing Data.
@@ -453,8 +461,8 @@ texts being (a) (see below), and with the Back-Cover Texts being (b)
* Debugger Info:: Obtaining Information about the Program and
the Debugger State.
* Miscellaneous Debugger Commands:: Miscellaneous Commands.
* Readline Support:: Readline Support.
* Limitations:: Limitations and Future Plans.
+* Readline Support:: Readline support.
+* Limitations:: Limitations and future plans.
* V7/SVR3.1:: The major changes between V7 and System V
Release 3.1.
* SVR4:: Minor changes between System V Releases 3.1
@@ -519,11 +527,6 @@ texts being (a) (see below), and with the Back-Cover Texts being (b)
day.
* Basic High Level:: The high level view.
* Basic Data Typing:: A very quick intro to data types.
* Floating Point Issues:: Stuff to know about floatingpoint numbers.
* String Conversion Precision:: The String Value Can Lie.
* Unexpected Results:: Floating Point Numbers Are Not Abstract
 Numbers.
* POSIX Floating Point Problems:: Standards Versus Existing Practice.
To Miriam, for making me complete.
@@ -2580,8 +2583,8 @@ A number of environment variables influence how `gawk' behaves.
* AWKPATH Variable:: Searching directories for `awk'
programs.
* AWKLIBPATH Variable:: Searching directories for `awk'
 shared libraries.
+* AWKLIBPATH Variable:: Searching directories for `awk' shared
+ libraries.
* Other Environment Variables:: The environment variables.
@@ -3718,7 +3721,6 @@ have to be named on the `awk' command line (*note Getline::).
* Getline:: Reading files under explicit program control
using the `getline' function.
* Read Timeout:: Reading input with a timeout.

* Command line directories:: What happens if you put a directory on the
command line.
@@ -13711,8 +13713,8 @@ usage messages, warnings, and fatal errors in the local language.
File: gawk.info, Node: Arbitrary Precision Arithmetic, Next: Advanced
Features, Prev: Internationalization, Up: Top
11 Arbitrary Precision Arithmetic with `gawk'
*********************************************
+11 Arithmetic and Arbitrary Precision Arithmetic with `gawk'
+************************************************************
There's a credibility gap: We don't know how much of the
computer's answers to believe. Novice computer users solve this
@@ -13721,49 +13723,27 @@ File: gawk.info, Node: Arbitrary Precision Arithmetic, Next: Advanced Features
answer are significant. Disillusioned computer users have just the
opposite approach; they are constantly afraid that their answers
are almost meaningless.

Donald Knuth(1)
 This minor node decsribes how to use the arbitrary precision (also
known as "multiple precision" or "infinite precision") numeric
capabilites in `gawk' to produce maximally accurate results when you
need it. But first you should check if your version of `gawk' supports
arbitrary precision arithmetic. The easiest way to find out is to look
at the output of the following command:

 $ gawk version
  GNU Awk 4.1.0 (GNU MPFR 3.1.0, GNU MP 5.0.3)
  Copyright (C) 1989, 19912012 Free Software Foundation.
 ...
+ This major node discusses issues that you may encounter when
+performing arithmetic. It begins by discussing some of the general
+attributes of computer arithmetic, along with how this can influence
+what you see when running `awk' programs. This discussion applies to
+all versions of `awk'.
 `gawk' uses the GNU MPFR (http://www.mpfr.org) and GNU MP
(http://gmplib.org) (GMP) libraries for arbitrary precision arithmetic
on numbers. So if you do not see the names of these libraries in the
output, then your version of `gawk' does not support arbitrary
precision arithmetic.

 Even if you aren't interested in arbitrary precision arithmetic, you
may still benifit from knowing about how `gawk' handles numbers in
general, and the limitations of doing arithmetic with ordinary `gawk'
numbers.
+ Then the discussion moves on to "arbitrary precision arithmetic", a
+feature which is specific to `gawk'.
* Menu:
* Floatingpoint Programming:: Effective Floatingpoint Programming.
* Floatingpoint Representation:: Binary Floatingpoint Representation.
* Floatingpoint Context:: Floatingpoint Context.
* Rounding Mode:: Floatingpoint Rounding Mode.
* Arbitrary Precision Floats:: Arbitrary Precision Floatingpoint
 Arithmetic with `gawk'.
* Setting Precision:: Setting the Working Precision.
* Setting Rounding Mode:: Setting the Rounding Mode.
* Floatingpoint Constants:: Representing Floatingpoint Constants.
* Changing Precision:: Changing the Precision of a Number.
* Exact Arithmetic:: Exact Arithmetic with Floatingpoint
Numbers.
* Integer Programming:: Effective Integer Programming.
* Arbitrary Precision Integers:: Arbitrary Precision Integer
 Arithmetic with `gawk'.
* MPFR and GMP Libraries:: Information About the MPFR and GMP
Libraries.
+* General Arithmetic:: An introduction to computer arithmetic.
+* Floatingpoint Programming:: Effective floatingpoint programming.
+* Gawk and MPFR:: How `gawk' provides
+ arbitrary-precision arithmetic.
+* Arbitrary Precision Floats:: Arbitrary precision floatingpoint arithmetic
+ with `gawk'.
+* Arbitrary Precision Integers:: Arbitrary precision integer arithmetic with
+ `gawk'.
 Footnotes 
@@ -13772,190 +13752,495 @@ numbers.
229.
File: gawk.info, Node: Floatingpoint Programming, Next: Floatingpoint
Representation, Up: Arbitrary Precision Arithmetic
+File: gawk.info, Node: General Arithmetic, Next: Floatingpoint Programming,
Up: Arbitrary Precision Arithmetic
11.1 Effective Floatingpoint Programming
=========================================
+11.1 A General Description of Computer Arithmetic
+=================================================
Numerical programming is an extensive area; if you need to develop
sophisticated numerical algorithms then `gawk' may not be the ideal
tool, and this documentation may not be sufficient. It might require a
book or two to communicate how to compute with ideal accuracy and
precision and the result often depends on the particular application.
+Within computers, there are two kinds of numeric values: "integers" and
+"floating-point". In school, integer values were referred to as
+"whole" numbers--that is, numbers without any fractional part, such as
+1, 42, or -17. The advantage to integer numbers is that they represent
+values exactly. The disadvantage is that their range is limited. On
+most systems, this range is -2,147,483,648 to 2,147,483,647. However,
+many systems now support a range from -9,223,372,036,854,775,808 to
+9,223,372,036,854,775,807.
 NOTE: A floatingpoint calculation's "accuracy" is how close it
 comes to the real value. This is as opposed to the "precision",
 which usually refers to the number of bits used to represent the
 number (see the Wikipedia article
 (http://en.wikipedia.org/wiki/Accuracy_and_precision) for more
 information).
+ Integer values come in two flavors: "signed" and "unsigned". Signed
+values may be negative or positive, with the range of values just
+described. Unsigned values are always positive. On most systems, the
+range is from 0 to 4,294,967,295. However, many systems now support a
+range from 0 to 18,446,744,073,709,551,615.
 Binary floatingpoint representations and arithmetic are inexact.
Simple values like 0.1 cannot be precisely represented using binary
floatingpoint numbers, and the limited precision of floatingpoint
numbers means that slight changes in the order of operations or the
precision of intermediate storage can change the result. To make
matters worse with arbitrary precision floatingpoint, you can set the
precision before starting a computation, but then you cannot be sure of
the number of significant decimal places in the final result.
+ Floating-point numbers represent what are called "real" numbers;
+i.e., those that do have a fractional part, such as 3.1415927. The
+advantage to floating-point numbers is that they can represent a much
+larger range of values. The disadvantage is that there are numbers
+that they cannot represent exactly. `awk' uses "double precision"
+floating-point numbers, which can hold more digits than "single
+precision" floating-point numbers.
 Sometimes you need to think more about what you really want and
what's really happening. Consider the two numbers in the following
example:
+ There are several important issues to be aware of, described next.
 x = 0.875 # 1/2 + 1/4 + 1/8
 y = 0.425
+* Menu:
 Unlike the number in `y', the number stored in `x' is exactly
representable in binary since it can be written as a finite sum of one
or more fractions whose denominators are all powers of two. When
`gawk' reads a floatingpoint number from program source, it
automatically rounds that number to whatever precision your machine
supports. If you try to print the numeric content of a variable using
an output format string of `"%.17g"', it may not produce the same
number as you assigned to it:
+* Floating Point Issues:: Stuff to know about floatingpoint numbers.
+* Integer Programming:: Effective integer programming.
 $ gawk 'BEGIN { x = 0.875; y = 0.425
 > printf("%0.17g, %0.17g\n", x, y) }'
  0.875, 0.42499999999999999
+
+File: gawk.info, Node: Floating Point Issues, Next: Integer Programming,
Up: General Arithmetic
 Often the error is so small you do not even notice it, and if you do,
you can always specify how much precision you would like in your output.
Usually this is a format string like `"%.15g"', which when used in the
previous example, produces an output identical to the input.
+11.1.1 Floating-Point Number Caveats
+------------------------------------
 Because the underlying representation can be little bit off from the
exact value, comparing floats to see if they are equal is generally not
a good idea. Here is an example where it does not work like you expect:
+As mentioned earlier, floating-point numbers represent what are called
+"real" numbers, i.e., those that have a fractional part. `awk' uses
+double precision floating-point numbers to represent all numeric
+values. This minor node describes some of the issues involved in using
+floating-point numbers.
 $ gawk 'BEGIN { print (0.1 + 12.2 == 12.3) }'
  0
+ There is a very nice paper on floating-point arithmetic
+(http://www.validlab.com/goldberg/paper.pdf) by David Goldberg, "What
+Every Computer Scientist Should Know About Floating-point Arithmetic,"
+`ACM Computing Surveys' *23*, 1 (1991-03), 5-48. This is worth reading
+if you are interested in the details, but it does require a background
+in computer science.
 The loss of accuracy during a single computation with floatingpoint
numbers usually isn't enough to worry about. However, if you compute a
value which is the result of a sequence of floating point operations,
the error can accumulate and greatly affect the computation itself.
Here is an attempt to compute the value of the constant pi using one of
its many series representations:
+* Menu:
 BEGIN {
 x = 1.0 / sqrt(3.0)
 n = 6
 for (i = 1; i < 30; i++) {
 n = n * 2.0
 x = (sqrt(x * x + 1)  1) / x
 printf("%.15f\n", n * x)
 }
 }
+* String Conversion Precision:: The String Value Can Lie.
+* Unexpected Results:: Floating Point Numbers Are Not Abstract
+ Numbers.
+* POSIX Floating Point Problems:: Standards Versus Existing Practice.
 When run, the early errors propagating through later computations
cause the loop to terminate prematurely after an attempt to divide by
zero.
+
+File: gawk.info, Node: String Conversion Precision, Next: Unexpected
Results, Up: Floating Point Issues
 $ gawk f pi.awk
  3.215390309173475
  3.159659942097510
  3.146086215131467
  3.142714599645573
 ...
  3.224515243534819
  2.791117213058638
  0.000000000000000
 error> gawk: pi.awk:6: fatal: division by zero attempted
+11.1.1.1 The String Value Can Lie
+.................................
 Here is one more example where the inaccuracies in internal
representations yield an unexpected result:
+Internally, `awk' keeps both the numeric value (double precision
+floating-point) and the string value for a variable. Separately, `awk'
+keeps track of what type the variable has (*note Typing and
+Comparison::), which plays a role in how variables are used in
+comparisons.
 $ gawk 'BEGIN {
 > for (d = 1.1; d <= 1.5; d += 0.1)
 > i++
 > print i
 > }'
  4
+ It is important to note that the string value for a number may not
+reflect the full value (all the digits) that the numeric value actually
+contains. The following program (`values.awk') illustrates this:
 Can computation using aribitrary precision help with the previous
examples? If you are impatient to know, see *note Exact Arithmetic::.
+ {
+ sum = $1 + $2
+ # see it for what it is
+ printf("sum = %.12g\n", sum)
+ # use CONVFMT
+ a = "<" sum ">"
+ print "a =", a
+ # use OFMT
+ print "sum =", sum
+ }
 Instead of aribitrary precision floatingpoint arithmetic, often all
you need is an adjustment of your logic or a different order for the
operations in your calculation. The stability and the accuracy of the
computation of the constant pi in the previous example can be enhanced
by using the following simple algebraic transformation:
+This program shows the full value of the sum of `$1' and `$2' using
+`printf', and then prints the string values obtained from both
+automatic conversion (via `CONVFMT') and from printing (via `OFMT').
 (sqrt(x * x + 1)  1) / x = x / (sqrt(x * x + 1) + x)
+ Here is what happens when the program is run:
 There is no need to be unduly suspicious about the results from
floatingpoint arithmetic. The lesson to remember is that
floatingpoint math is always more complex than the math using pencil
and paper. In order to take advantage of the power of computer
floatingpoint, you need to know its limitations and work within them.
For most casual use of floatingpoint arithmetic, you will often get
the expected result in the end if you simply round the display of your
final results to the correct number of significant decimal digits.
Avoid presenting numerical data in a manner that implies better
precision than is actually the case.
+ $ echo 3.654321 1.2345678 | awk -f values.awk
+ -| sum = 4.8888888
+ -| a = <4.88889>
+ -| sum = 4.88889

File: gawk.info, Node: Floatingpoint Representation, Next: Floatingpoint
Context, Prev: Floatingpoint Programming, Up: Arbitrary Precision Arithmetic
+ This makes it clear that the full numeric value is different from
+what the default string representations show.
11.2 Binary Floatingpoint Representation
=========================================
+ `CONVFMT''s default value is `"%.6g"', which yields a value with at
+most six significant digits. For some applications, you might want to
+change it to specify more precision. On most modern machines, most of
+the time, 17 digits is enough to capture a floating-point number's
+value exactly.(1)
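
Editorial sketch, not part of the committed text: the effect of `CONVFMT' can be seen by converting the same sum to a string twice, once with the default setting and once after asking for 17 significant digits. The function name `show_convfmt' is hypothetical, and the snippet assumes a POSIX `awk' on the PATH that uses IEEE double precision numbers.

```shell
# Hypothetical helper: convert one sum to a string under two CONVFMT settings.
show_convfmt() {
    awk 'BEGIN {
        sum = 3.654321 + 1.2345678
        a = "" sum            # concatenation converts via the default CONVFMT ("%.6g")
        CONVFMT = "%.17g"     # ask for up to 17 significant digits
        b = "" sum            # same number, converted again with the new CONVFMT
        print a
        print b
    }'
}
show_convfmt
```

The first line printed should be the six-digit string 4.88889; the second carries many more digits, whose exact tail depends on the binary value stored for the sum.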
Although floatingpoint representations vary from machine to machine,
the most commonly encountered representation is that defined by the
IEEE 754 Standard. An IEEE754 format value has three components:
+  Footnotes 
 * a sign bit telling whether the number is positive or negative,
+ (1) Pathological cases can require up to 752 digits (!), but we
+doubt that you need to worry about this.
 * an "exponent" giving its order of magnitude, E,
+
+File: gawk.info, Node: Unexpected Results, Next: POSIX Floating Point
Problems, Prev: String Conversion Precision, Up: Floating Point Issues
 * and a "significand", S, specifying the actual digits of the number.
+11.1.1.2 Floating Point Numbers Are Not Abstract Numbers
+........................................................
 The value of the number is then S * 2^E. The first bit of a
nonzero binary significand is always one, so the significand in an
IEEE754 format only includes the fractional part, leaving the leading
one implicit.
+Unlike numbers in the abstract sense (such as what you studied in high
+school or college arithmetic), numbers stored in computers are limited
+in certain ways. They cannot represent an infinite number of digits,
+nor can they always represent things exactly. In particular,
+floating-point numbers cannot always represent values exactly. Here is
+an example:
 Three of the standard IEEE754 types are 32bit single precision,
64bit double precision and 128bit quadruple precision. The standard
also specifies extended precision formats to allow greater precisions
and larger exponent ranges.
+ $ awk '{ printf("%010d\n", $1 * 100) }'
+ 515.79
+ -| 0000051579
+ 515.80
+ -| 0000051579
+ 515.81
+ -| 0000051580
+ 515.82
+ -| 0000051582
+ Ctrl-d

File: gawk.info, Node: Floatingpoint Context, Next: Rounding Mode, Prev:
Floatingpoint Representation, Up: Arbitrary Precision Arithmetic
+This shows that some values can be represented exactly, whereas others
+are only approximated. This is not a "bug" in `awk', but simply an
+artifact of how computers represent numbers.
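
Editorial sketch, not part of the committed text: the `%d' conversion truncates, so a product that lands a hair below a whole number loses a unit. One common workaround, shown here under the assumption of IEEE doubles and a POSIX `awk', is to round to the nearest integer with `%.0f' instead. The function name `to_cents' is hypothetical.

```shell
# Hypothetical helper: scale each input amount by 100 and round to the
# nearest integer instead of truncating.
to_cents() {
    awk '{ printf("%.0f\n", $1 * 100) }'
}
printf '515.79\n515.80\n515.81\n515.82\n' | to_cents
```

With the four inputs from the example above, this should print 51579, 51580, 51581 and 51582, one per line.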
11.3 Floatingpoint Context
===========================
+ Another peculiarity of floating-point numbers on modern systems is
+that they often have more than one representation for the number zero!
+In particular, it is possible to represent "minus zero" as well as
+regular, or "positive" zero.
A floatingpoint context defines the environment for arithmetic
operations. It governs precision, sets rules for rounding and limits
range for exponents. The context has the following primary components:
+ This example shows that negative and positive zero are distinct
+values when stored internally, but that they are in fact equal to each
+other, as well as to "regular" zero:
`precision'
 Precision of the floatingpoint format in bits.
+ $ gawk 'BEGIN { mz = -0 ; pz = 0
+ > printf "-0 = %g, +0 = %g, (-0 == +0) -> %d\n", mz, pz, mz == pz
+ > printf "mz == 0 -> %d, pz == 0 -> %d\n", mz == 0, pz == 0
+ > }'
+ -| -0 = -0, +0 = 0, (-0 == +0) -> 1
+ -| mz == 0 -> 1, pz == 0 -> 1
`emax'
 Maximum exponent allowed for this format.
+ It helps to keep this in mind should you process numeric data that
+contains negative zero values; the fact that the zero is negative is
+noted and can affect comparisons.
`emin'
 Minimum exponent allowed for this format.
+
+File: gawk.info, Node: POSIX Floating Point Problems, Prev: Unexpected
Results, Up: Floating Point Issues
`underflow behavior'
 The format may or may not support gradual underflow.
+11.1.1.3 Standards Versus Existing Practice
+...........................................
`rounding'
 The rounding mode of this context.
+Historically, `awk' has converted any nonnumeric looking string to the
+numeric value zero, when required. Furthermore, the original
+definition of the language and the original POSIX standards specified
+that `awk' only understands decimal numbers (base 10), and not octal
+(base 8) or hexadecimal numbers (base 16).
 *note tableieeeformats:: lists the precision and exponent field
values for the basic IEEE754 binary formats:
+ Changes in the language of the 2001 and 2004 POSIX standards can be
+interpreted to imply that `awk' should support additional features.
+These features are:
Name Total bits Precision emin emax
+ * Interpretation of floating point data values specified in
+ hexadecimal notation (`0xDEADBEEF'). (Note: data values, _not_
+ source code constants.)
+
+ * Support for the special IEEE 754 floating point values "Not A
+ Number" (NaN), positive Infinity ("inf") and negative Infinity
+ ("-inf"). In particular, the format for these values is as
+ specified by the ISO 1999 C standard, which ignores case and can
+ allow machine-dependent additional characters after the `nan' and
+ allow either `inf' or `infinity'.
+
+ The first problem is that both of these are clear changes to
+historical practice:
+
+ * The `gawk' maintainer feels that supporting hexadecimal floating
+ point values, in particular, is ugly, and was never intended by the
+ original designers to be part of the language.
+
+ * Allowing completely alphabetic strings to have valid numeric
+ values is also a very severe departure from historical practice.
+
+ The second problem is that the `gawk' maintainer feels that this
+interpretation of the standard, which requires a certain amount of
+"language lawyering" to arrive at in the first place, was not even
+intended by the standard developers. In other words, "we see how you
+got where you are, but we don't think that that's where you want to be."
+
+ Recognizing the above issues, but attempting to provide compatibility
+with the earlier versions of the standard, the 2008 POSIX standard
+added explicit wording to allow, but not require, that `awk' support
+hexadecimal floating point values and special values for "Not A Number"
+and infinity.
+
+ Although the `gawk' maintainer continues to feel that providing
+those features is inadvisable, nevertheless, on systems that support
+IEEE floating point, it seems reasonable to provide _some_ way to
+support NaN and Infinity values. The solution implemented in `gawk' is
+as follows:
+
+ * With the `--posix' command-line option, `gawk' becomes "hands
+ off." String values are passed directly to the system library's
+ `strtod()' function, and if it successfully returns a numeric
+ value, that is what's used.(1) By definition, the results are not
+ portable across different systems. They are also a little
+ surprising:
+
+ $ echo nanny | gawk --posix '{ print $1 + 0 }'
+ -| nan
+ $ echo 0xDeadBeef | gawk --posix '{ print $1 + 0 }'
+ -| 3735928559
+
+ * Without `--posix', `gawk' interprets the four strings `+inf',
+ `-inf', `+nan', and `-nan' specially, producing the corresponding
+ special numeric values. The leading sign acts as a signal to `gawk'
+ (and the user) that the value is really numeric. Hexadecimal
+ floating point is not supported (unless you also use
+ `--non-decimal-data', which is _not_ recommended). For example:
+
+ $ echo nanny | gawk '{ print $1 + 0 }'
+ -| 0
+ $ echo +nan | gawk '{ print $1 + 0 }'
+ -| nan
+ $ echo 0xDeadBeef | gawk '{ print $1 + 0 }'
+ -| 0
+
+ `gawk' does ignore case in the four special values. Thus `+nan'
+ and `+NaN' are the same.
+
+  Footnotes 
+
+ (1) You asked for it, you got it.
+
+
+File: gawk.info, Node: Integer Programming, Prev: Floating Point Issues,
Up: General Arithmetic
+
+11.1.2 Mixing Integers And Floating-point
+-----------------------------------------
+
+As has been mentioned already, `gawk' ordinarily uses hardware double
+precision with 64-bit IEEE binary floating-point representation for
+numbers on most systems. A large integer like 9007199254740997 has a
+binary representation that, although finite, is more than 53 bits long;
+it must also be rounded to 53 bits. The biggest integer that can be
+stored in a C `double' is usually the same as the largest possible
+value of a `double'. If your system `double' is an IEEE 64-bit
+`double', this largest possible value is an integer and can be
+represented precisely. What more should one know about integers?
+
+ If you want to know what is the largest integer, such that it and
+all smaller integers can be stored in 64-bit doubles without losing
+precision, then the answer is 2^53. The next representable number is
+the even number 2^53 + 2, meaning it is unlikely that you will be able
+to make `gawk' print 2^53 + 1 in integer format. The range of integers
+exactly representable by a 64-bit double is [-2^53, 2^53]. If you ever
+see an integer outside this range in `gawk' using 64-bit doubles, you
+have reason to be very suspicious about the accuracy of the output.
+Here is a simple program with erroneous output:
+
+ $ gawk 'BEGIN { i = 2^53 - 1; for (j = 0; j < 4; j++) print i + j }'
+ -| 9007199254740991
+ -| 9007199254740992
+ -| 9007199254740992
+ -| 9007199254740994
+
+ The lesson is to not assume that any large integer printed by `gawk'
+represents an exact result from your computation, especially if it wraps
+around on your screen.
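
Editorial sketch, not part of the committed text: a quick way to see the 2^53 boundary is to check whether adding 1 changes the value at all. The function name `absorbs_one' is hypothetical; the snippet assumes a POSIX `awk' whose numbers are IEEE 64-bit doubles.

```shell
# Hypothetical helper: prints 1 if adding 1 to 2^53 leaves the value unchanged.
absorbs_one() {
    awk 'BEGIN {
        x = 2^53
        same = (x + 1 == x)   # 2^53 + 1 is not representable; the sum rounds back to 2^53
        print same
    }'
}
absorbs_one
```

A result of 1 means the increment was lost to rounding, which is the situation the paragraph above warns about.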
+
+
+File: gawk.info, Node: Floatingpoint Programming, Next: Gawk and MPFR,
Prev: General Arithmetic, Up: Arbitrary Precision Arithmetic
+
+11.2 Understanding Floating-point Programming
+=============================================
+
+Numerical programming is an extensive area; if you need to develop
+sophisticated numerical algorithms then `gawk' may not be the ideal
+tool, and this documentation may not be sufficient. It might require
+digesting a book or two to really internalize how to compute with ideal
+accuracy and precision and the result often depends on the particular
+application.
+
+ NOTE: A floatingpoint calculation's "accuracy" is how close it
+ comes to the real value. This is as opposed to the "precision",
+ which usually refers to the number of bits used to represent the
+ number (see the Wikipedia article
+ (http://en.wikipedia.org/wiki/Accuracy_and_precision) for more
+ information).
+
+ There are two options for doing floating-point calculations:
+hardware floating-point (as used by standard `awk' and the default for
+`gawk'), and "arbitrary-precision" floating-point, which is software
+based. This major node aims to provide enough information to
+understand both, and then will focus on `gawk''s facilities for the
+latter.
+
+ Binary floating-point representations and arithmetic are inexact.
+Simple values like 0.1 cannot be precisely represented using binary
+floating-point numbers, and the limited precision of floating-point
+numbers means that slight changes in the order of operations or the
+precision of intermediate storage can change the result. To make
+matters worse, with arbitrary precision floating-point, you can set the
+precision before starting a computation, but then you cannot be sure of
+the number of significant decimal places in the final result.
+
+ Sometimes, before you start to write any code, you should think more
+about what you really want and what's really happening. Consider the
+two numbers in the following example:
+
+ x = 0.875 # 1/2 + 1/4 + 1/8
+ y = 0.425
+
+ Unlike the number in `y', the number stored in `x' is exactly
+representable in binary since it can be written as a finite sum of one
+or more fractions whose denominators are all powers of two. When
+`gawk' reads a floating-point number from program source, it
+automatically rounds that number to whatever precision your machine
+supports. If you try to print the numeric content of a variable using
+an output format string of `"%.17g"', it may not produce the same
+number as you assigned to it:
+
+ $ gawk 'BEGIN { x = 0.875; y = 0.425
+ > printf("%0.17g, %0.17g\n", x, y) }'
+ -| 0.875, 0.42499999999999999
+
+ Often the error is so small you do not even notice it, and if you do,
+you can always specify how much precision you would like in your output.
+Usually this is a format string like `"%.15g"', which when used in the
+previous example, produces an output identical to the input.
+
+ Because the underlying representation can be a little bit off from the
+exact value, comparing floating-point values to see if they are equal
+is generally not a good idea. Here is an example where it does not
+work like you expect:
+
+ $ gawk 'BEGIN { print (0.1 + 12.2 == 12.3) }'
+ -| 0
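
Editorial sketch, not part of the committed text: a common alternative to testing for exact equality is to accept two values as equal when they differ by less than a small tolerance. The function names `compare_sums', `abs' and `near', and the tolerance 1e-9, are illustrative assumptions; a suitable tolerance depends on the application.

```shell
# Hypothetical helper: compare a sum with 12.3 exactly and with a tolerance.
compare_sums() {
    awk '
    function abs(v) { return v < 0 ? -v : v }
    # near() is an illustrative name; eps is the largest difference to accept.
    function near(a, b, eps) { return abs(a - b) <= eps }
    BEGIN {
        sum = 0.1 + 12.2
        exact = (sum == 12.3)            # exact comparison, as in the example above
        approx = near(sum, 12.3, 1e-9)   # comparison within the tolerance
        print exact, approx
    }'
}
compare_sums
```

On a system with IEEE doubles this should print `0 1': the exact test fails while the tolerant one succeeds.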
+
+ The loss of accuracy during a single computation with floating-point
+numbers usually isn't enough to worry about. However, if you compute a
+value which is the result of a sequence of floating point operations,
+the error can accumulate and greatly affect the computation itself.
+Here is an attempt to compute the value of the constant pi using one of
+its many series representations:
+
+ BEGIN {
+ x = 1.0 / sqrt(3.0)
+ n = 6
+ for (i = 1; i < 30; i++) {
+ n = n * 2.0
+ x = (sqrt(x * x + 1) - 1) / x
+ printf("%.15f\n", n * x)
+ }
+ }
+
+ When run, the early errors propagating through later computations
+cause the loop to terminate prematurely after an attempt to divide by
+zero.
+
+ $ gawk -f pi.awk
+ -| 3.215390309173475
+ -| 3.159659942097510
+ -| 3.146086215131467
+ -| 3.142714599645573
+ ...
+ -| 3.224515243534819
+ -| 2.791117213058638
+ -| 0.000000000000000
+ error--> gawk: pi.awk:6: fatal: division by zero attempted
+
+ Here is one more example where the inaccuracies in internal
+representations yield an unexpected result:
+
+ $ gawk 'BEGIN {
+ > for (d = 1.1; d <= 1.5; d += 0.1)
+ > i++
+ > print i
+ > }'
+ -| 4
+
+ Can computation using arbitrary precision help with the previous
+examples? If you are impatient to know, see *note Exact Arithmetic::.
+
+ Instead of arbitrary precision floating-point arithmetic, often all
+you need is an adjustment of your logic or a different order for the
+operations in your calculation. The stability and the accuracy of the
+computation of the constant pi in the previous example can be enhanced
+by using the following simple algebraic transformation:
+
+ (sqrt(x * x + 1) - 1) / x = x / (sqrt(x * x + 1) + 1)
+
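Multiplying the numerator and denominator of (sqrt(x * x + 1) - 1) / x by
sqrt(x * x + 1) + 1 gives x / (sqrt(x * x + 1) + 1), a form that never
subtracts two nearly equal values. As a sketch, assuming an `awk' that uses
IEEE-754 doubles (the shell variable `out' is illustrative only), the earlier
loop can be rerun with that form:

```shell
# Same iteration as the pi example, using the rewritten update step.
out=$(awk 'BEGIN {
    x = 1.0 / sqrt(3.0)
    n = 6
    for (i = 1; i < 30; i++) {
        n = n * 2.0
        x = x / (sqrt(x * x + 1) + 1)   # no cancellation in this form
    }
    printf("%.10f\n", n * x)            # print only the final estimate
}')
echo "$out"
```

In double precision the loop runs to completion, and the final estimate
agrees with pi to the ten decimal places printed (3.1415926536).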
+ There is no need to be unduly suspicious about the results from
+floating-point arithmetic. The lesson to remember is that
+floating-point arithmetic is always more complex than the arithmetic
+using pencil and paper. In order to take advantage of the power of
+computer floating-point, you need to know its limitations and work
+within them. For most casual use of floating-point arithmetic, you will
+often get the expected result in the end if you simply round the
+display of your final results to the correct number of significant
+decimal digits. Also, avoid presenting numerical data in a manner that
+implies better precision than is actually the case.
+
+* Menu:
+
+* Floatingpoint Representation:: Binary floatingpoint representation.
+* Floatingpoint Context:: Floatingpoint context.
+* Rounding Mode:: Floatingpoint rounding mode.
+
+
+File: gawk.info, Node: Floatingpoint Representation, Next: Floatingpoint
Context, Up: Floatingpoint Programming
+
+11.2.1 Binary Floating-point Representation
+-------------------------------------------
+
+Although floating-point representations vary from machine to machine,
+the most commonly encountered representation is that defined by the
+IEEE 754 Standard. An IEEE-754 format value has three components:
+
+ * A sign bit telling whether the number is positive or negative.
+
+ * An "exponent" giving its order of magnitude, E.
+
+ * A "significand", S, specifying the actual digits of the number.
+
+ The value of the number is then S * 2^E. The first bit of a
+non-zero binary significand is always one, so the significand in an
+IEEE-754 format only includes the fractional part, leaving the leading
+one implicit.
+
+ Three of the standard IEEE-754 types are 32-bit single precision,
+64-bit double precision and 128-bit quadruple precision. The standard
+also specifies extended precision formats to allow greater precisions
+and larger exponent ranges.
+
+ The significand is stored in "normalized" format, which means that
+the first bit is always a one.
+
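To make the S * 2^E form concrete, here is a small sketch that normalizes a
positive value so that the significand lies in [1, 2). The function name
`decompose' is hypothetical and the approach (repeated scaling by two) is
chosen for readability; it does not inspect the stored bits.

```shell
out=$(awk '
# Hypothetical helper: write a positive value v as s * 2^e with 1 <= s < 2.
function decompose(v,    e) {
    e = 0
    while (v >= 2) { v /= 2; e++ }   # scale down, counting powers of two
    while (v < 1)  { v *= 2; e-- }   # scale up for values below one
    return sprintf("%g * 2^%d", v, e)
}
BEGIN {
    print decompose(0.875)
    print decompose(10)
}')
echo "$out"
```

This prints `1.75 * 2^-1' and `1.25 * 2^3'. In binary, 1.75 is 1.11, so only
the fractional bits `11' would need to be stored, with the leading one left
implicit.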
+
+File: gawk.info, Node: Floatingpoint Context, Next: Rounding Mode, Prev:
Floatingpoint Representation, Up: Floatingpoint Programming
+
+11.2.2 Floating-point Context
+-----------------------------
+
+A floating-point "context" defines the environment for arithmetic
+operations. It governs precision, sets rules for rounding, and limits
+the range for exponents. The context has the following primary
+components:
+
+"Precision"
+ Precision of the floatingpoint format in bits.
+
+"emax"
+ Maximum exponent allowed for this format.
+
+"emin"
+ Minimum exponent allowed for this format.
+
+"Underflow behavior"
+ The format may or may not support gradual underflow.
+
+"Rounding"
+ The rounding mode of this context.
+
+ *note table-ieee-formats:: lists the precision and exponent field
+values for the basic IEEE-754 binary formats:
+
+Name        Total bits   Precision   emin     emax

 Single      32           24          -126     +127
 Double      64           53          -1022    +1023
 Quadruple   128          113         -16382   +16383
Table 11.1: Basic IEEE Formats
+Table 11.1: Basic IEEE Format Context Values
NOTE: The precision numbers include the implied leading one that
gives them one extra bit of significand.
@@ 13977,28 +14262,27 @@ corresponding to 64bit binary with 53 bits of
precision.
IEEE754 binary formats support subnormal numbers.
File: gawk.info, Node: Rounding Mode, Next: Arbitrary Precision Floats,
Prev: Floatingpoint Context, Up: Arbitrary Precision Arithmetic
+File: gawk.info, Node: Rounding Mode, Prev: Floatingpoint Context, Up:
Floatingpoint Programming
11.4 Floatingpoint Rounding Mode
=================================
+11.2.3 Floatingpoint Rounding Mode
+
The "rounding mode" specifies the behavior for the results of numerical
operations when discarding extra precision. Each rounding mode indicates
how the least significant returned digit of a rounded result is to be
calculated. The `ROUNDMODE' variable (*note Setting Rounding Mode::)
provides program level control over the rounding mode. *note
tableroundingmodes:: lists the IEEE754 defined rounding modes:
+calculated. *note tableroundingmodes:: lists the IEEE754 defined
+rounding modes:
Rounding Mode IEEE Name `ROUNDMODE'

Round to nearest, ties to even `roundTiesToEven' `"N"' or `"n"'
Round toward plus Infinity `roundTowardPositive' `"U"' or `"u"'
Round toward negative Infinity `roundTowardNegative' `"D"' or `"d"'
Round toward zero `roundTowardZero' `"Z"' or `"z"'
Round to nearest, ties away `roundTiesToAway' `"A"' or `"a"'
from zero
+Rounding Mode IEEE Name
+
+Round to nearest, ties to even `roundTiesToEven'
+Round toward plus Infinity `roundTowardPositive'
+Round toward negative Infinity `roundTowardNegative'
+Round toward zero `roundTowardZero'
+Round to nearest, ties away `roundTiesToAway'
+from zero
Table 11.2: Rounding Modes
+Table 11.2: IEEE 754 Rounding Modes
The default mode `roundTiesToEven' is the most preferred, but the
least intuitive. This method does the obvious thing for most values, by
@@ 14021,7 +14305,7 @@ format floatingpoint numbers. For example:
}
}
produces the following output when run(1):
+produces the following output when run:(1)
3.5 => 4
2.5 => 2
@@ 14050,9 +14334,9 @@ the number with the larger magnitude if a tie occurs.
Some numerical analysts will tell you that your choice of rounding
style has tremendous impact on the final outcome, and advise you to
wait until final output for any rounding. Instead, you can often
achieve this goal by setting the precision initially to some value
sufficiently larger than the final desired precision, so that the
+wait until final output for any rounding. Instead, you can often avoid
+roundoff error problems by setting the precision initially to some
+value sufficiently larger than the final desired precision, so that the
accumulation of roundoff error does not influence the outcome. If you
suspect that results from your computation are sensitive to
accumulation of roundoff error, one way to be sure is to look for a
@@ 14065,9 +14349,39 @@ C library in your system does not use the IEEE754
evenrounding rule
to round halfway cases for `printf()'.
File: gawk.info, Node: Arbitrary Precision Floats, Next: Setting Precision,
Prev: Rounding Mode, Up: Arbitrary Precision Arithmetic
+File: gawk.info, Node: Gawk and MPFR, Next: Arbitrary Precision Floats,
Prev: Floatingpoint Programming, Up: Arbitrary Precision Arithmetic
+
+11.3 `gawk' + MPFR = Powerful Arithmetic
+========================================
+
+The rest of this major node describes how to use the arbitrary precision
+(also known as "multiple precision" or "infinite precision") numeric
+capabilities in `gawk' to produce maximally accurate results when you
+need them.
+
+ But first you should check if your version of `gawk' supports
+arbitrary precision arithmetic. The easiest way to find out is to look
+at the output of the following command:
+
+ $ gawk --version
+ -| GNU Awk 4.1.0 (GNU MPFR 3.1.0, GNU MP 5.0.3)
+ -| Copyright (C) 1989, 1991-2012 Free Software Foundation.
+ ...
+
+ `gawk' uses the GNU MPFR (http://www.mpfr.org) and GNU MP
+(http://gmplib.org) (GMP) libraries for arbitrary precision arithmetic
+on numbers. So if you do not see the names of these libraries in the
+output, then your version of `gawk' does not support arbitrary
+precision arithmetic.
+
+ Additionally, there are a few elements available in the `PROCINFO'
+array to provide information about the MPFR and GMP libraries. *Note
+Autoset::, for more information.
+
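The version check described above can also be scripted. The following is a
sketch only: it assumes that a `gawk' built with MPFR mentions "MPFR" in its
version banner, as in the sample output shown earlier, and the variable name
`mpfr_support' is illustrative.

```shell
# Report whether the installed gawk (if any) names MPFR in its version banner.
if command -v gawk >/dev/null 2>&1 && gawk --version | grep -q 'MPFR'; then
    mpfr_support=yes
else
    mpfr_support=no
fi
echo "arbitrary precision support: $mpfr_support"
```

A result of `no' means either that `gawk' is not installed or that it was
built without the MPFR and GMP libraries.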
+
+File: gawk.info, Node: Arbitrary Precision Floats, Next: Arbitrary Precision
Integers, Prev: Gawk and MPFR, Up: Arbitrary Precision Arithmetic
11.5 Arbitrary Precision Floatingpoint Arithmetic with `gawk'
+11.4 Arbitrary Precision Floatingpoint Arithmetic with `gawk'
==============================================================
`gawk' uses the GNU MPFR library for arbitrary precision floatingpoint
@@ 14081,13 +14395,14 @@ Two builtin variables `PREC' (*note Setting
Precision::) and
working precision and the rounding mode. The precision and the
rounding mode are set globally for every operation to follow.
 The default working precision for arbitrary precision floats is 53,
and the default value for `ROUNDMODE' is `"N"', which selects the
IEEE754 `roundTiesToEven' (*note Rounding Mode::) rounding mode.(1)
`gawk' uses the default exponent range in MPFR (EMAX = 2^30  1, EMIN =
EMAX) for all floatingpoint contexts. There is no explicit mechanism
to adjust the exponent range. MPFR does not implement subnormal
numbers by default, and this behavior cannot be changed in `gawk'.
+ The default working precision for arbitrary precision floating-point
+values is 53, and the default value for `ROUNDMODE' is `"N"', which
+selects the IEEE-754 `roundTiesToEven' (*note Rounding Mode::) rounding
+mode.(1) `gawk' uses the default exponent range in MPFR (EMAX = 2^30 -
+1, EMIN = -EMAX) for all floating-point contexts. There is no explicit
+mechanism to adjust the exponent range. MPFR does not implement
+subnormal numbers by default, and this behavior cannot be changed in
+`gawk'.
NOTE: When emulating an IEEE754 format (*note Setting
Precision::), `gawk' internally adjusts the exponent range to the
@@ 14096,9 +14411,17 @@ numbers by default, and this behavior cannot be
changed in `gawk'.
NOTE: MPFR numbers are variablesize entities, consuming only as
much space as needed to store the significant digits. Since the
 performance using MPFR numbers pales in comparison to doing math
 using the underlying machine types, you should consider using only
 as much precision as needed by your program.
+ performance using MPFR numbers pales in comparison to doing
+ arithmetic using the underlying machine types, you should consider
+ using only as much precision as needed by your program.
+
+* Menu:
+
+* Setting Precision:: Setting the working precision.
+* Setting Rounding Mode:: Setting the rounding mode.
+* Floatingpoint Constants:: Representing floatingpoint constants.
+* Changing Precision:: Changing the precision of a number.
+* Exact Arithmetic:: Exact arithmetic with floatingpoint numbers.
 Footnotes 
@@ 14109,10 +14432,10 @@ computations with doubleprecision machine
floatingpoint numbers
and subnormal numbers are not implemented.
File: gawk.info, Node: Setting Precision, Next: Setting Rounding Mode,
Prev: Arbitrary Precision Floats, Up: Arbitrary Precision Arithmetic
+File: gawk.info, Node: Setting Precision, Next: Setting Rounding Mode, Up:
Arbitrary Precision Floats
11.6 Setting the Working Precision
==================================
+11.4.1 Setting the Working Precision
+
`gawk' uses a global working precision; it does not keep track of the
precision or accuracy of individual numbers. Performing an arithmetic
@@ 14166,24 +14489,38 @@ floatingpoint computations with more than 15
significant digits in
them.
Conversely, it takes a precision of 332 bits to hold an approximation
of constant pi that is accurate to 100 decimal places. You should
+of the constant pi that is accurate to 100 decimal places. You should
always add some extra bits in order to avoid the confusing roundoff
issues that occur because numbers are stored internally in binary.
File: gawk.info, Node: Setting Rounding Mode, Next: Floatingpoint
Constants, Prev: Setting Precision, Up: Arbitrary Precision Arithmetic
+File: gawk.info, Node: Setting Rounding Mode, Next: Floatingpoint
Constants, Prev: Setting Precision, Up: Arbitrary Precision Floats
11.7 Setting the Rounding Mode
==============================
+11.4.2 Setting the Rounding Mode
+
+
+The `ROUNDMODE' variable provides program-level control over the
+rounding mode. The correspondence between `ROUNDMODE' and the IEEE
+rounding modes is shown in *note table-gawk-rounding-modes::.
+
+Rounding Mode                    IEEE Name              `ROUNDMODE'
+---------------------------------------------------------------------
+Round to nearest, ties to even   `roundTiesToEven'      `"N"' or `"n"'
+Round toward plus Infinity       `roundTowardPositive'  `"U"' or `"u"'
+Round toward negative Infinity   `roundTowardNegative'  `"D"' or `"d"'
+Round toward zero                `roundTowardZero'      `"Z"' or `"z"'
+Round to nearest, ties away      `roundTiesToAway'      `"A"' or `"a"'
+from zero
The builtin variable `ROUNDMODE' has the default value `"N"', which
selects the IEEE754 rounding mode `roundTiesToEven'. The other
possible values for `ROUNDMODE' are `"U"' for rounding mode
`roundTowardPositive', `"D"' for `roundTowardNegative', and `"Z"' for
`roundTowardZero'. `gawk' also accepts `"A"' to select the IEEE754
mode `roundTiesToAway' if your version of the MPFR library supports it;
otherwise setting `ROUNDMODE' to this value has no effect. *Note
Rounding Mode::, for the meanings of the various rounding modes.
+Table 11.3: `gawk' Rounding Modes
+
+ `ROUNDMODE' has the default value `"N"', which selects the IEEE-754
+rounding mode `roundTiesToEven'. In *note Table 11.3:
+table-gawk-rounding-modes, the value `"A"' selects the IEEE-754 mode
+`roundTiesToAway'. This mode is available only if your version of the
+MPFR library supports it; otherwise setting `ROUNDMODE' to `"A"' has no
+effect. *Note Rounding Mode::, for the meanings of the various rounding
+modes.
Here is an example of how to change the default rounding behavior of
`printf''s output:
@@ 14192,10 +14529,10 @@ Rounding Mode::, for the meanings of the various
rounding modes.
 1.37
File: gawk.info, Node: Floatingpoint Constants, Next: Changing Precision,
Prev: Setting Rounding Mode, Up: Arbitrary Precision Arithmetic
+File: gawk.info, Node: Floatingpoint Constants, Next: Changing Precision,
Prev: Setting Rounding Mode, Up: Arbitrary Precision Floats
11.8 Representing Floatingpoint Constants
==========================================
+11.4.3 Representing Floatingpoint Constants
+
Be wary of floatingpoint constants! When reading a floatingpoint
constant from program source code, `gawk' uses the default precision,
@@ 14205,7 +14542,7 @@ the precision using `PREC' in the program text does not
change the
precision of a constant. If you need to represent a floatingpoint
constant at a higher precision than the default and cannot use a
command line assignment to `PREC', you should either specify the
constant as a string, or a rational number whenever possible. The
+constant as a string, or as a rational number whenever possible. The
following example illustrates the differences among various ways to
print a floatingpoint constant:
@@ 14222,10 +14559,10 @@ print a floatingpoint constant:
of 53.
File: gawk.info, Node: Changing Precision, Next: Exact Arithmetic, Prev:
Floatingpoint Constants, Up: Arbitrary Precision Arithmetic
+File: gawk.info, Node: Changing Precision, Next: Exact Arithmetic, Prev:
Floatingpoint Constants, Up: Arbitrary Precision Floats
11.9 Changing the Precision of a Number
=======================================
+11.4.4 Changing the Precision of a Number
+
The point is that in any variableprecision package, a decision is
made on how to treat numbers given as data, or arising in
@@ 14257,14 +14594,14 @@ or:
 Footnotes 
(1) Dirk Laurie. `Variableprecision Arithmetic Considered Perilous
 A Detective Story'. Electronic Transactions on Numerical Analysis.
+ A Detective Story'. Electronic Transactions on Numerical Analysis.
Volume 28, pp. 168173, 2008.
File: gawk.info, Node: Exact Arithmetic, Next: Integer Programming, Prev:
Changing Precision, Up: Arbitrary Precision Arithmetic
+File: gawk.info, Node: Exact Arithmetic, Prev: Changing Precision, Up:
Arbitrary Precision Floats
11.10 Exact Arithmetic with Floatingpoint Numbers
==================================================
+11.4.5 Exact Arithmetic with Floatingpoint Numbers
+
CAUTION: Never depend on the exactness of floatingpoint
arithmetic, even for apparently simple expressions!
@@ 14311,58 +14648,31 @@ range of the other.
double precision arithmetic can be adequate, and is usually much faster.
But you do need to keep in mind that every floatingpoint operation can
suffer a new rounding error with catastrophic consequences as
illustrated by our attempt to compute the value of the constant pi,
(*note Floatingpoint Programming::). Extra precision can greatly
enhance the stability and the accuracy of your computation in such
cases.

 Repeated addition is not necessarily equivalent to multiplication in
floatingpoint arithmetic. In the last example (*note Floatingpoint
Programming::), you may or may not succeed in getting the correct
result by choosing an arbitrarily large value for `PREC'. Reformulation
of the problem at hand is often the correct approach in such situations.


File: gawk.info, Node: Integer Programming, Next: Arbitrary Precision
Integers, Prev: Exact Arithmetic, Up: Arbitrary Precision Arithmetic

11.11 Effective Integer Programming
===================================

As has been mentioned already, `gawk' ordinarily uses hardware double
precision with 64bit IEEE binary floatingpoint representation for
numbers on most systems. A large integer like 9007199254740997 has a
binary representation that, although finite, is more than 53 bits long;
it must also be rounded to 53 bits. The biggest integer that can be
stored in a C `double' is usually the same as the largest possible
value of a `double'. If your system `double' is an IEEE 64bit
`double', this largest possible value is an integer and can be
represented precisely. What more should one know about integers?

 If you want to know what is the largest integer, such that it and
all smaller integers can be stored in 64bit doubles without losing
precision, then the answer is 2^53. The next representable number is
the even number 2^53 + 2, meaning it is unlikely that you will be able
to make `gawk' print 2^53 + 1 in integer format. The range of integers
exactly representable by a 64bit double is [2^53, 2^53]. If you ever
see an integer outside this range in `gawk' using 64bit doubles, you
have reason to be very suspicious about the accuracy of the output.
Here is a simple program with erroneous output:
+illustrated by our attempt to compute the value of the constant pi
+(*note Floatingpoint Programming::). Extra precision can greatly
+enhance the stability and the accuracy of your computation in such
+cases.
 $ gawk 'BEGIN { i = 2^53  1; for (j = 0; j < 4; j++) print i + j }'
  9007199254740991
  9007199254740992
  9007199254740992
  9007199254740994
+ Repeated addition is not necessarily equivalent to multiplication in
+floatingpoint arithmetic. In the example in *note Floatingpoint
+Programming:::
 The lesson is to not assume that any large integer printed by `gawk'
represents an exact result from your computation, especially if it wraps
around on your screen.
+ $ gawk 'BEGIN {
+ > for (d = 1.1; d <= 1.5; d += 0.1)
+ > i++
+ > print i
+ > }'
+  4
+
+you may or may not succeed in getting the correct result by choosing an
+arbitrarily large value for `PREC'. Reformulation of the problem at
+hand is often the correct approach in such situations.
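
One possible reformulation, sketched here as an illustration rather than as
the only fix, drives the loop with an integer counter and derives the
fractional value from it. Small integers are represented exactly in a double,
so the number of iterations no longer depends on accumulated rounding error.
The counter name `k' is illustrative.

```shell
# Count with integers; compute the fractional value from the counter.
out=$(awk 'BEGIN {
    for (k = 11; k <= 15; k++) {   # integer counter: exact in a double
        d = k / 10                 # the value the original loop intended
        i++
    }
    print i
}')
echo "$out"
```

This prints 5, the number of iterations the original loop was meant to
perform.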
File: gawk.info, Node: Arbitrary Precision Integers, Next: MPFR and GMP
Libraries, Prev: Integer Programming, Up: Arbitrary Precision Arithmetic
+File: gawk.info, Node: Arbitrary Precision Integers, Prev: Arbitrary
Precision Floats, Up: Arbitrary Precision Arithmetic
11.12 Arbitrary Precision Integer Arithmetic with `gawk'
========================================================
+11.5 Arbitrary Precision Integer Arithmetic with `gawk'
+=======================================================
If the option `--bignum' or `-M' is specified, `gawk' performs all
integer arithmetic using GMP arbitrary precision integers. Any number
@@ 14384,7 +14694,8 @@ computes 5^4^3^2, the result of which is beyond the
limits of ordinary
If you were to compute the same value using arbitrary precision
floatingpoint values instead, the precision needed for correct output
(using the formula `prec = 3.322 * dps'), would be 3.322 x 183231, or
608693.
+608693. (Thus, the floatingpoint representation requires over 30
+times as many decimal digits!)
The result from an arithmetic operation with an integer and a
floatingpoint value is a floatingpoint value with a precision equal
@@ 14419,32 +14730,22 @@ this:
gawk -M 'BEGIN { n = 13; print (n + 0.0) % 2.0 }'
You can avoid this issue altogether by specifying the number as a
float to begin with:
+floatingpoint value to begin with:
gawk -M 'BEGIN { n = 13.0; print n % 2.0 }'
 Note that for the particular example above, there is unlikely to be a
reason for simply not using the following:
+ Note that for the particular example above, it is likely best to
+just use the following:
gawk -M 'BEGIN { n = 13; print n % 2 }'
 Footnotes 
 (1) Weisstein, Eric W. `Sylvester's Sequence'. From MathWorldA
+ (1) Weisstein, Eric W. `Sylvester's Sequence'. From MathWorldA
Wolfram Web Resource.
`http://mathworld.wolfram.com/SylvestersSequence.html'
File: gawk.info, Node: MPFR and GMP Libraries, Prev: Arbitrary Precision
Integers, Up: Arbitrary Precision Arithmetic

11.13 Information About the MPFR and GMP Libraries
==================================================

There are a few elements available in the `PROCINFO' array to provide
information about the MPFR and GMP libraries. *Note Autoset::, for
more information.


File: gawk.info, Node: Advanced Features, Next: Library Functions, Prev:
Arbitrary Precision Arithmetic, Up: Top
12 Advanced Features of `gawk'
@@ 23421,7 +23722,6 @@ introductory texts that you should refer to instead.)
* Basic High Level:: The high level view.
* Basic Data Typing:: A very quick intro to data types.
* Floating Point Issues:: Stuff to know about floatingpoint numbers.
File: gawk.info, Node: Basic High Level, Next: Basic Data Typing, Up: Basic
Concepts
@@ 23520,7 +23820,7 @@ such as C, C++, or Ada, and then translated, or
"compiled", into a form
that the computer can execute directly.
File: gawk.info, Node: Basic Data Typing, Next: Floating Point Issues,
Prev: Basic High Level, Up: Basic Concepts
+File: gawk.info, Node: Basic Data Typing, Prev: Basic High Level, Up: Basic
Concepts
D.2 Data Values in a Computer
=============================
@@ 23540,34 +23840,10 @@ characters that comprise them. Individual variables,
as well as
numeric and string variables, are referred to as "scalar" values.
Groups of values, such as arrays, are not scalars.
 Within computers, there are two kinds of numeric values: "integers"
and "floatingpoint". In school, integer values were referred to as
"whole" numbersthat is, numbers without any fractional part, such as
1, 42, or 17. The advantage to integer numbers is that they represent
values exactly. The disadvantage is that their range is limited. On
most systems, this range is 2,147,483,648 to 2,147,483,647. However,
many systems now support a range from 9,223,372,036,854,775,808 to
9,223,372,036,854,775,807.

 Integer values come in two flavors: "signed" and "unsigned". Signed
values may be negative or positive, with the range of values just
described. Unsigned values are always positive. On most systems, the
range is from 0 to 4,294,967,295. However, many systems now support a
range from 0 to 18,446,744,073,709,551,615.

 Floatingpoint numbers represent what are called "real" numbers;
i.e., those that do have a fractional part, such as 3.1415927. The
advantage to floatingpoint numbers is that they can represent a much
larger range of values. The disadvantage is that there are numbers
that they cannot represent exactly. `awk' uses "double precision"
floatingpoint numbers, which can hold more digits than "single
precision" floatingpoint numbers. Floatingpoint issues are discussed
more fully in *note Floating Point Issues::.

 At the very lowest level, computers store values as groups of binary
digits, or "bits". Modern computers group bits into groups of eight,
called "bytes". Advanced applications sometimes have to manipulate
bits directly, and `gawk' provides functions for doing so.
+ *note General Arithmetic::, provided a basic introduction to numeric
+types (integer and floating-point) and how they are used in a computer.
+Please review that information, including a number of caveats that were
+presented.
While you are probably used to the idea of a number without a value
(i.e., zero), it takes a bit more getting used to the idea of
@@ 23588,6 +23864,11 @@ represents 1 times 8, plus 0 times 4, plus 1 times 2,
plus 0 times 1,
or decimal 10. Octal and hexadecimal are discussed more in *note
Nondecimalnumbers::.
+ At the very lowest level, computers store values as groups of binary
+digits, or "bits". Modern computers group bits into groups of eight,
+called "bytes". Advanced applications sometimes have to manipulate
+bits directly, and `gawk' provides functions for doing so.
+
Programs are written in programming languages. Hundreds, if not
thousands, of programming languages exist. One of the most popular is
the C programming language. The C language had a very strong influence
@@ 23605,218 +23886,6 @@ In 1999, a revised ISO C standard was approved and
released. Where it
makes sense, POSIX `awk' is compatible with 1999 ISO C.
File: gawk.info, Node: Floating Point Issues, Prev: Basic Data Typing, Up:
Basic Concepts

D.3 FloatingPoint Number Caveats
=================================

As mentioned earlier, floatingpoint numbers represent what are called
"real" numbers, i.e., those that have a fractional part. `awk' uses
double precision floatingpoint numbers to represent all numeric
values. This minor node describes some of the issues involved in using
floatingpoint numbers.

 There is a very nice paper on floatingpoint arithmetic
(http://www.validlab.com/goldberg/paper.pdf) by David Goldberg, "What
Every Computer Scientist Should Know About Floatingpoint Arithmetic,"
`ACM Computing Surveys' *23*, 1 (199103), 548. This is worth reading
if you are interested in the details, but it does require a background
in computer science.

* Menu:

* String Conversion Precision:: The String Value Can Lie.
* Unexpected Results:: Floating Point Numbers Are Not Abstract
 Numbers.
* POSIX Floating Point Problems:: Standards Versus Existing Practice.


File: gawk.info, Node: String Conversion Precision, Next: Unexpected
Results, Up: Floating Point Issues

D.3.1 The String Value Can Lie


Internally, `awk' keeps both the numeric value (double precision
floatingpoint) and the string value for a variable. Separately, `awk'
keeps track of what type the variable has (*note Typing and
Comparison::), which plays a role in how variables are used in
comparisons.

 It is important to note that the string value for a number may not
reflect the full value (all the digits) that the numeric value actually
contains. The following program (`values.awk') illustrates this:

 {
 sum = $1 + $2
 # see it for what it is
 printf("sum = %.12g\n", sum)
 # use CONVFMT
 a = "<" sum ">"
 print "a =", a
 # use OFMT
 print "sum =", sum
 }

This program shows the full value of the sum of `$1' and `$2' using
`printf', and then prints the string values obtained from both
automatic conversion (via `CONVFMT') and from printing (via `OFMT').

 Here is what happens when the program is run:

 $ echo 3.654321 1.2345678  awk f values.awk
  sum = 4.8888888
  a = <4.88889>
  sum = 4.88889

 This makes it clear that the full numeric value is different from
what the default string representations show.

 `CONVFMT''s default value is `"%.6g"', which yields a value with at
least six significant digits. For some applications, you might want to
change it to specify more precision. On most modern machines, most of
the time, 17 digits is enough to capture a floatingpoint number's
value exactly.(1)

  Footnotes 

 (1) Pathological cases can require up to 752 digits (!), but we
doubt that you need to worry about this.


File: gawk.info, Node: Unexpected Results, Next: POSIX Floating Point
Problems, Prev: String Conversion Precision, Up: Floating Point Issues

D.3.2 Floating Point Numbers Are Not Abstract Numbers


Unlike numbers in the abstract sense (such as what you studied in high
school or college math), numbers stored in computers are limited in
certain ways. They cannot represent an infinite number of digits, nor
can they always represent things exactly. In particular,
floatingpoint numbers cannot always represent values exactly. Here is
an example:

 $ awk '{ printf("%010d\n", $1 * 100) }'
 515.79
  0000051579
 515.80
  0000051579
 515.81
  0000051580
 515.82
  0000051582
 Ctrld

This shows that some values can be represented exactly, whereas others
are only approximated. This is not a "bug" in `awk', but simply an
artifact of how computers represent numbers.

 Another peculiarity of floatingpoint numbers on modern systems is
that they often have more than one representation for the number zero!
In particular, it is possible to represent "minus zero" as well as
regular, or "positive" zero.

 This example shows that negative and positive zero are distinct
values when stored internally, but that they are in fact equal to each
other, as well as to "regular" zero:

 $ gawk 'BEGIN { mz = 0 ; pz = 0
 > printf "0 = %g, +0 = %g, (0 == +0) > %d\n", mz, pz, mz == pz
 > printf "mz == 0 > %d, pz == 0 > %d\n", mz == 0, pz == 0
 > }'
  0 = 0, +0 = 0, (0 == +0) > 1
  mz == 0 > 1, pz == 0 > 1

 It helps to keep this in mind should you process numeric data that
contains negative zero values; the fact that the zero is negative is
noted and can affect comparisons.


File: gawk.info, Node: POSIX Floating Point Problems, Prev: Unexpected
Results, Up: Floating Point Issues

D.3.3 Standards Versus Existing Practice


Historically, `awk' has converted any nonnumeric looking string to the
numeric value zero, when required. Furthermore, the original
definition of the language and the original POSIX standards specified
that `awk' only understands decimal numbers (base 10), and not octal
(base 8) or hexadecimal numbers (base 16).

 Changes in the language of the 2001 and 2004 POSIX standard can be
interpreted to imply that `awk' should support additional features.
These features are:

 * Interpretation of floating point data values specified in
 hexadecimal notation (`0xDEADBEEF'). (Note: data values, _not_
 source code constants.)

 * Support for the special IEEE 754 floating point values "Not A
 Number" (NaN), positive Infinity ("inf") and negative Infinity
 ("inf"). In particular, the format for these values is as
 specified by the ISO 1999 C standard, which ignores case and can
 allow machinedependent additional characters after the `nan' and
 allow either `inf' or `infinity'.

 The first problem is that both of these are clear changes to
historical practice:

 * The `gawk' maintainer feels that supporting hexadecimal floating
 point values, in particular, is ugly, and was never intended by the
 original designers to be part of the language.

 * Allowing completely alphabetic strings to have valid numeric
 values is also a very severe departure from historical practice.

 The second problem is that the `gawk' maintainer feels that this
interpretation of the standard, which requires a certain amount of
"language lawyering" to arrive at in the first place, was not even
intended by the standard developers. In other words, "we see how you
got where you are, but we don't think that that's where you want to be."

 The 2008 POSIX standard added explicit wording to allow, but not
require, that `awk' support hexadecimal floating point values and
special values for "Not A Number" and infinity.

 Although the `gawk' maintainer continues to feel that providing
those features is inadvisable, nevertheless, on systems that support
IEEE floating point, it seems reasonable to provide _some_ way to
support NaN and Infinity values. The solution implemented in `gawk' is
as follows:

 * With the `--posix' command-line option, `gawk' becomes "hands
 off." String values are passed directly to the system library's
 `strtod()' function, and if it successfully returns a numeric
 value, that is what's used.(1) By definition, the results are not
 portable across different systems. They are also a little
 surprising:

 $ echo nanny | gawk --posix '{ print $1 + 0 }'
 -| nan
 $ echo 0xDeadBeef | gawk --posix '{ print $1 + 0 }'
 -| 3735928559

 * Without `--posix', `gawk' interprets the four strings `+inf',
 `-inf', `+nan', and `-nan' specially, producing the corresponding
 special numeric values. The leading sign acts as a signal to `gawk'
 (and the user) that the value is really numeric. Hexadecimal
 floating point is not supported (unless you also use
 `--non-decimal-data', which is _not_ recommended). For example:

 $ echo nanny | gawk '{ print $1 + 0 }'
 -| 0
 $ echo +nan | gawk '{ print $1 + 0 }'
 -| nan
 $ echo 0xDeadBeef | gawk '{ print $1 + 0 }'
 -| 0

 `gawk' does ignore case in the four special values. Thus `+nan'
 and `+NaN' are the same.

 ---------- Footnotes ----------

 (1) You asked for it, you got it.


File: gawk.info, Node: Glossary, Next: Copying, Prev: Basic Concepts, Up:
Top
Glossary
@@ -26695,7 +26764,7 @@ Index
* dollar sign ($), $ field operator: Fields. (line 19)
* dollar sign ($), incrementing fields and arrays: Increment Ops.
(line 30)
-* double precision floating-point: Basic Data Typing. (line 36)
+* double precision floating-point: General Arithmetic. (line 21)
* double quote (") <1>: Quoting. (line 37)
* double quote ("): Read Terminal. (line 25)
* double quote ("), regexp constants: Computed Regexps. (line 28)
@@ -26942,7 +27011,7 @@ Index
* floating-point numbers, arbitrary precision: Arbitrary Precision Arithmetic.
(line 6)
* floating-point, numbers <1>: Unexpected Results. (line 6)
-* floating-point, numbers: Basic Data Typing. (line 21)
+* floating-point, numbers: General Arithmetic. (line 6)
* floating-point, numbers, AWKNUM internal type: Internals. (line 19)
* FNR variable <1>: Autoset. (line 103)
* FNR variable: Records. (line 6)
@@ -27305,8 +27374,8 @@ Index
* int() function: Numeric Functions. (line 23)
* integer, arbitrary precision: Arbitrary Precision Integers.
(line 6)
-* integers: Basic Data Typing. (line 21)
-* integers, unsigned: Basic Data Typing. (line 30)
+* integers: General Arithmetic. (line 6)
+* integers, unsigned: General Arithmetic. (line 15)
* interacting with other programs: I/O Functions. (line 63)
* internal constant, INVALID_HANDLE: Internals. (line 157)
* internal function, assoc_clear(): Internals. (line 68)
@@ -27378,7 +27447,7 @@ Index
* Kahrs, Ju"rgen: Acknowledgments. (line 60)
* Kasal, Stepan: Acknowledgments. (line 60)
* Kenobi, Obi-Wan: Undocumented. (line 6)
-* Kernighan, Brian <1>: Basic Data Typing. (line 74)
+* Kernighan, Brian <1>: Basic Data Typing. (line 55)
* Kernighan, Brian <2>: Other Versions. (line 13)
* Kernighan, Brian <3>: Contributors. (line 12)
* Kernighan, Brian <4>: BTL. (line 6)
@@ -27583,7 +27652,7 @@ Index
* NR variable <1>: Auto-set. (line 119)
* NR variable: Records. (line 6)
* NR variable, changing: Auto-set. (line 225)
-* null strings <1>: Basic Data Typing. (line 50)
+* null strings <1>: Basic Data Typing. (line 26)
* null strings <2>: Truth Values. (line 6)
* null strings <3>: Regexp Field Splitting.
(line 43)
@@ -27608,7 +27677,7 @@ Index
* numbers, converting <1>: Bitwise Functions. (line 107)
* numbers, converting: Conversion. (line 6)
* numbers, converting, to strings: User-modified. (line 28)
-* numbers, floating-point: Basic Data Typing. (line 21)
+* numbers, floating-point: General Arithmetic. (line 6)
* numbers, floating-point, AWKNUM internal type: Internals. (line 19)
* numbers, hexadecimal: Nondecimal-numbers. (line 6)
* numbers, NODE internal type: Internals. (line 23)
@@ -28014,7 +28083,7 @@ Index
* right angle bracket (>), >> operator (I/O) <1>: Precedence. (line 65)
* right angle bracket (>), >> operator (I/O): Redirection. (line 50)
* right shift, bitwise: Bitwise Functions. (line 32)
-* Ritchie, Dennis: Basic Data Typing. (line 74)
+* Ritchie, Dennis: Basic Data Typing. (line 55)
* RLENGTH variable: Autoset. (line 201)
* RLENGTH variable, match() function and: String Functions. (line 223)
* Robbins, Arnold <1>: Future Extensions. (line 6)
@@ -28133,7 +28202,7 @@ Index
* silent debugger command: Debugger Execution Control.
(line 10)
* sin() function: Numeric Functions. (line 75)
-* single precision floating-point: Basic Data Typing. (line 36)
+* single precision floating-point: General Arithmetic. (line 21)
* single quote (') <1>: Quoting. (line 31)
* single quote (') <2>: Long. (line 33)
* single quote ('): One-shot. (line 15)
@@ -28365,7 +28434,7 @@ Index
* UNIXROOT variable, on OS/2 systems: PC Using. (line 17)
* unref() internal function: Internals. (line 92)
* unset_ERRNO() internal function: Internals. (line 141)
-* unsigned integers: Basic Data Typing. (line 30)
+* unsigned integers: General Arithmetic. (line 15)
* until debugger command: Debugger Execution Control.
(line 83)
* unwatch debugger command: Viewing And Changing Data.
@@ -28499,442 +28568,444 @@ Index
Tag Table:
Node: Top1352
Node: Foreword31758
Node: Preface36103
Ref: PrefaceFootnote139156
Ref: PrefaceFootnote239262
Node: History39494
Node: Names41885
Ref: NamesFootnote143362
Node: This Manual43434
Ref: This ManualFootnote148372
Node: Conventions48472
Node: Manual History50606
Ref: Manual HistoryFootnote153876
Ref: Manual HistoryFootnote253917
Node: How To Contribute53991
Node: Acknowledgments55135
Node: Getting Started59631
Node: Running gawk62010
Node: Oneshot63196
Node: Read Terminal64421
Ref: Read TerminalFootnote166071
Ref: Read TerminalFootnote266347
Node: Long66518
Node: Executable Scripts67894
Ref: Executable ScriptsFootnote169763
Ref: Executable ScriptsFootnote269865
Node: Comments70412
Node: Quoting72879
Node: DOS Quoting77502
Node: Sample Data Files78177
Node: Very Simple81209
Node: Two Rules85808
Node: More Complex87955
Ref: More ComplexFootnote190885
Node: Statements/Lines90970
Ref: Statements/LinesFootnote195432
Node: Other Features95697
Node: When96625
Node: Invoking Gawk98772
Node: Command Line100233
Node: Options101016
Ref: OptionsFootnote1115658
Node: Other Arguments115683
Node: Naming Standard Input118341
Node: Environment Variables119435
Node: AWKPATH Variable119993
Ref: AWKPATH VariableFootnote1122582
Node: AWKLIBPATH Variable122842
Node: Other Environment Variables123439
Node: Exit Status125934
Node: Include Files126609
Node: Loading Shared Libraries130110
Node: Obsolete131335
Node: Undocumented132032
Node: Regexp132275
Node: Regexp Usage133664
Node: Escape Sequences135690
Node: Regexp Operators141453
Ref: Regexp OperatorsFootnote1148833
Ref: Regexp OperatorsFootnote2148980
Node: Bracket Expressions149078
Ref: tablecharclasses150968
Node: GNU Regexp Operators153491
Node: Casesensitivity157214
Ref: CasesensitivityFootnote1160182
Ref: CasesensitivityFootnote2160417
Node: Leftmost Longest160525
Node: Computed Regexps161726
Node: Reading Files165136
Node: Records167140
Ref: RecordsFootnote1175814
Node: Fields175851
Ref: FieldsFootnote1178884
Node: Nonconstant Fields178970
Node: Changing Fields181172
Node: Field Separators187153
Node: Default Field Splitting189782
Node: Regexp Field Splitting190899
Node: Single Character Fields194241
Node: Command Line Field Separator195300
Node: Field Splitting Summary198741
Ref: Field Splitting SummaryFootnote1201933
Node: Constant Size202034
Node: Splitting By Content206618
Ref: Splitting By ContentFootnote1210344
Node: Multiple Line210384
Ref: Multiple LineFootnote1216231
Node: Getline216410
Node: Plain Getline218626
Node: Getline/Variable220715
Node: Getline/File221856
Node: Getline/Variable/File223178
Ref: Getline/Variable/FileFootnote1224777
Node: Getline/Pipe224864
Node: Getline/Variable/Pipe227424
Node: Getline/Coprocess228531
Node: Getline/Variable/Coprocess229774
Node: Getline Notes230488
Node: Getline Summary232430
Ref: tablegetlinevariants232773
Node: Read Timeout233629
Ref: Read TimeoutFootnote1237374
Node: Command line directories237431
Node: Printing238061
Node: Print239692
Node: Print Examples241029
Node: Output Separators243813
Node: OFMT245573
Node: Printf246931
Node: Basic Printf247837
Node: Control Letters249376
Node: Format Modifiers253188
Node: Printf Examples259197
Node: Redirection261912
Node: Special Files268896
Node: Special FD269429
Ref: Special FDFootnote1273054
Node: Special Network273128
Node: Special Caveats273978
Node: Close Files And Pipes274774
Ref: Close Files And PipesFootnote1281797
Ref: Close Files And PipesFootnote2281945
Node: Expressions282095
Node: Values283227
Node: Constants283903
Node: Scalar Constants284583
Ref: Scalar ConstantsFootnote1285442
Node: Nondecimalnumbers285624
Node: Regexp Constants288683
Node: Using Constant Regexps289158
Node: Variables292213
Node: Using Variables292868
Node: Assignment Options294592
Node: Conversion296464
Ref: tablelocaleaffects301840
Ref: ConversionFootnote1302464
Node: All Operators302573
Node: Arithmetic Ops303203
Node: Concatenation305708
Ref: ConcatenationFootnote1308501
Node: Assignment Ops308621
Ref: tableassignops313609
Node: Increment Ops315017
Node: Truth Values and Conditions318487
Node: Truth Values319570
Node: Typing and Comparison320619
Node: Variable Typing321408
Ref: Variable TypingFootnote1325305
Node: Comparison Operators325427
Ref: tablerelationalops325837
Node: POSIX String Comparison329386
Ref: POSIX String ComparisonFootnote1330342
Node: Boolean Ops330480
Ref: Boolean OpsFootnote1334558
Node: Conditional Exp334649
Node: Function Calls336381
Node: Precedence339975
Node: Locales343644
Node: Patterns and Actions344733
Node: Pattern Overview345787
Node: Regexp Patterns347456
Node: Expression Patterns347999
Node: Ranges351684
Node: BEGIN/END354650
Node: Using BEGIN/END355412
Ref: Using BEGIN/ENDFootnote1358143
Node: I/O And BEGIN/END358249
Node: BEGINFILE/ENDFILE360531
Node: Empty363424
Node: Using Shell Variables363740
Node: Action Overview366025
Node: Statements368382
Node: If Statement370236
Node: While Statement371735
Node: Do Statement373779
Node: For Statement374935
Node: Switch Statement378087
Node: Break Statement380184
Node: Continue Statement382174
Node: Next Statement383967
Node: Nextfile Statement386357
Node: Exit Statement388902
Node: Builtin Variables391318
Node: Usermodified392413
Ref: UsermodifiedFootnote1400768
Node: Autoset400830
Ref: AutosetFootnote1410738
Node: ARGC and ARGV410943
Node: Arrays414794
Node: Array Basics416299
Node: Array Intro417125
Node: Reference to Elements421443
Node: Assigning Elements423713
Node: Array Example424204
Node: Scanning an Array425936
Node: Controlling Scanning428250
Ref: Controlling ScanningFootnote1433183
Node: Delete433499
Ref: DeleteFootnote1435934
Node: Numeric Array Subscripts435991
Node: Uninitialized Subscripts438174
Node: Multidimensional439802
Node: Multiscanning442896
Node: Arrays of Arrays444487
Node: Functions449132
Node: Builtin449954
Node: Calling Builtin451032
Node: Numeric Functions453020
Ref: Numeric FunctionsFootnote1456852
Ref: Numeric FunctionsFootnote2457209
Ref: Numeric FunctionsFootnote3457257
Node: String Functions457526
Ref: String FunctionsFootnote1481023
Ref: String FunctionsFootnote2481152
Ref: String FunctionsFootnote3481400
Node: Gory Details481487
Ref: tablesubescapes483166
Ref: tablesubposix92484520
Ref: tablesubproposed485863
Ref: tableposixsub487213
Ref: tablegensubescapes488759
Ref: Gory DetailsFootnote1489966
Ref: Gory DetailsFootnote2490017
Node: I/O Functions490168
Ref: I/O FunctionsFootnote1496823
Node: Time Functions496970
Ref: Time FunctionsFootnote1507862
Ref: Time FunctionsFootnote2507930
Ref: Time FunctionsFootnote3508088
Ref: Time FunctionsFootnote4508199
Ref: Time FunctionsFootnote5508311
Ref: Time FunctionsFootnote6508538
Node: Bitwise Functions508804
Ref: tablebitwiseops509362
Ref: Bitwise FunctionsFootnote1513522
Node: Type Functions513706
Node: I18N Functions514176
Node: Userdefined515803
Node: Definition Syntax516607
Ref: Definition SyntaxFootnote1521517
Node: Function Example521586
Node: Function Caveats524180
Node: Calling A Function524601
Node: Variable Scope525716
Node: Pass By Value/Reference527691
Node: Return Statement531131
Node: Dynamic Typing534112
Node: Indirect Calls534847
Node: Internationalization544532
Node: I18N and L10N545971
Node: Explaining gettext546657
Ref: Explaining gettextFootnote1551723
Ref: Explaining gettextFootnote2551907
Node: Programmer i18n552072
Node: Translator i18n556272
Node: String Extraction557065
Ref: String ExtractionFootnote1558026
Node: Printf Ordering558112
Ref: Printf OrderingFootnote1560896
Node: I18N Portability560960
Ref: I18N PortabilityFootnote1563409
Node: I18N Example563472
Ref: I18N ExampleFootnote1566107
Node: Gawk I18N566179
Node: Arbitrary Precision Arithmetic566796
Ref: Arbitrary Precision ArithmeticFootnote1569671
Node: Floatingpoint Programming569819
Node: Floatingpoint Representation575089
Node: Floatingpoint Context576193
Ref: tableieeeformats577028
Node: Rounding Mode578398
Ref: tableroundingmodes579025
Ref: Rounding ModeFootnote1582148
Node: Arbitrary Precision Floats582329
Ref: Arbitrary Precision FloatsFootnote1584370
Node: Setting Precision584681
Node: Setting Rounding Mode587439
Node: Floatingpoint Constants588356
Node: Changing Precision589775
Ref: Changing PrecisionFootnote1591175
Node: Exact Arithmetic591348
Node: Integer Programming594361
Node: Arbitrary Precision Integers596141
Ref: Arbitrary Precision IntegersFootnote1599165
Node: MPFR and GMP Libraries599311
Node: Advanced Features599696
Node: Nondecimal Data601219
Node: Array Sorting602802
Node: Controlling Array Traversal603499
Node: Array Sorting Functions611736
Ref: Array Sorting FunctionsFootnote1615410
Ref: Array Sorting FunctionsFootnote2615503
Node: Twoway I/O615697
Ref: Twoway I/OFootnote1621129
Node: TCP/IP Networking621199
Node: Profiling624043
Node: Library Functions631497
Ref: Library FunctionsFootnote1634504
Node: Library Names634675
Ref: Library NamesFootnote1638146
Ref: Library NamesFootnote2638366
Node: General Functions638452
Node: Strtonum Function639405
Node: Assert Function642335
Node: Round Function645661
Node: Cliff Random Function647204
Node: Ordinal Functions648220
Ref: Ordinal FunctionsFootnote1651290
Ref: Ordinal FunctionsFootnote2651542
Node: Join Function651751
Ref: Join FunctionFootnote1653522
Node: Gettimeofday Function653722
Node: Data File Management657437
Node: Filetrans Function658069
Node: Rewind Function662208
Node: File Checking663595
Node: Empty Files664689
Node: Ignoring Assigns666919
Node: Getopt Function668472
Ref: Getopt FunctionFootnote1679776
Node: Passwd Functions679979
Ref: Passwd FunctionsFootnote1688954
Node: Group Functions689042
Node: Walking Arrays697126
Node: Sample Programs698695
Node: Running Examples699360
Node: Clones700088
Node: Cut Program701312
Node: Egrep Program711157
Ref: Egrep ProgramFootnote1718930
Node: Id Program719040
Node: Split Program722656
Ref: Split ProgramFootnote1726175
Node: Tee Program726303
Node: Uniq Program729106
Node: Wc Program736535
Ref: Wc ProgramFootnote1740801
Ref: Wc ProgramFootnote2741001
Node: Miscellaneous Programs741093
Node: Dupword Program742281
Node: Alarm Program744312
Node: Translate Program749061
Ref: Translate ProgramFootnote1753448
Ref: Translate ProgramFootnote2753676
Node: Labels Program753810
Ref: Labels ProgramFootnote1757181
Node: Word Sorting757265
Node: History Sorting761149
Node: Extract Program762988
Ref: Extract ProgramFootnote1770471
Node: Simple Sed770599
Node: Igawk Program773661
Ref: Igawk ProgramFootnote1788818
Ref: Igawk ProgramFootnote2789019
Node: Anagram Program789157
Node: Signature Program792225
Node: Debugger793325
Node: Debugging794277
Node: Debugging Concepts794710
Node: Debugging Terms796566
Node: Awk Debugging799163
Node: Sample Debugging Session800055
Node: Debugger Invocation800575
Node: Finding The Bug801904
Node: List of Debugger Commands808392
Node: Breakpoint Control809726
Node: Debugger Execution Control813390
Node: Viewing And Changing Data816750
Node: Execution Stack820106
Node: Debugger Info821573
Node: Miscellaneous Debugger Commands825554
Node: Readline Support830999
Node: Limitations831830
Node: Language History834082
Node: V7/SVR3.1835594
Node: SVR4837915
Node: POSIX839357
Node: BTL840365
Node: POSIX/GNU841099
Node: Common Extensions846390
Node: Ranges and Locales847497
Ref: Ranges and LocalesFootnote1852101
Node: Contributors852322
Node: Installation856583
Node: Gawk Distribution857477
Node: Getting857961
Node: Extracting858787
Node: Distribution contents860479
Node: Unix Installation865701
Node: Quick Installation866318
Node: Additional Configuration Options868280
Node: Configuration Philosophy869757
Node: NonUnix Installation872099
Node: PC Installation872557
Node: PC Binary Installation873856
Node: PC Compiling875704
Node: PC Testing878648
Node: PC Using879824
Node: Cygwin884009
Node: MSYS885009
Node: VMS Installation885523
Node: VMS Compilation886126
Ref: VMS CompilationFootnote1887133
Node: VMS Installation Details887191
Node: VMS Running888826
Node: VMS Old Gawk890433
Node: Bugs890907
Node: Other Versions894759
Node: Notes900074
Node: Compatibility Mode900766
Node: Additions901549
Node: Accessing The Source902361
Node: Adding Code903786
Node: New Ports909753
Node: Dynamic Extensions913866
Node: Internals915306
Node: Plugin License924128
Node: Loading Extensions924766
Node: Sample Library926605
Node: Internal File Description927295
Node: Internal File Ops931010
Ref: Internal File OpsFootnote1935752
Node: Using Internal File Ops935892
Node: Future Extensions938269
Node: Basic Concepts940773
Node: Basic High Level941530
Ref: Basic High LevelFootnote1945565
Node: Basic Data Typing945750
Node: Floating Point Issues950275
Node: String Conversion Precision951358
Ref: String Conversion PrecisionFootnote1953058
Node: Unexpected Results953167
Node: POSIX Floating Point Problems954993
Ref: POSIX Floating Point ProblemsFootnote1958698
Node: Glossary958736
Node: Copying983712
Node: GNU Free Documentation License1021269
Node: Index1046406
+Node: Foreword31919
+Node: Preface36264
+Ref: PrefaceFootnote139317
+Ref: PrefaceFootnote239423
+Node: History39655
+Node: Names42046
+Ref: NamesFootnote143523
+Node: This Manual43595
+Ref: This ManualFootnote148533
+Node: Conventions48633
+Node: Manual History50767
+Ref: Manual HistoryFootnote154037
+Ref: Manual HistoryFootnote254078
+Node: How To Contribute54152
+Node: Acknowledgments55296
+Node: Getting Started59792
+Node: Running gawk62171
+Node: Oneshot63357
+Node: Read Terminal64582
+Ref: Read TerminalFootnote166232
+Ref: Read TerminalFootnote266508
+Node: Long66679
+Node: Executable Scripts68055
+Ref: Executable ScriptsFootnote169924
+Ref: Executable ScriptsFootnote270026
+Node: Comments70573
+Node: Quoting73040
+Node: DOS Quoting77663
+Node: Sample Data Files78338
+Node: Very Simple81370
+Node: Two Rules85969
+Node: More Complex88116
+Ref: More ComplexFootnote191046
+Node: Statements/Lines91131
+Ref: Statements/LinesFootnote195593
+Node: Other Features95858
+Node: When96786
+Node: Invoking Gawk98933
+Node: Command Line100394
+Node: Options101177
+Ref: OptionsFootnote1115819
+Node: Other Arguments115844
+Node: Naming Standard Input118502
+Node: Environment Variables119596
+Node: AWKPATH Variable120154
+Ref: AWKPATH VariableFootnote1122743
+Node: AWKLIBPATH Variable123003
+Node: Other Environment Variables123600
+Node: Exit Status126095
+Node: Include Files126770
+Node: Loading Shared Libraries130271
+Node: Obsolete131496
+Node: Undocumented132193
+Node: Regexp132436
+Node: Regexp Usage133825
+Node: Escape Sequences135851
+Node: Regexp Operators141614
+Ref: Regexp OperatorsFootnote1148994
+Ref: Regexp OperatorsFootnote2149141
+Node: Bracket Expressions149239
+Ref: tablecharclasses151129
+Node: GNU Regexp Operators153652
+Node: Casesensitivity157375
+Ref: CasesensitivityFootnote1160343
+Ref: CasesensitivityFootnote2160578
+Node: Leftmost Longest160686
+Node: Computed Regexps161887
+Node: Reading Files165297
+Node: Records167300
+Ref: RecordsFootnote1175974
+Node: Fields176011
+Ref: FieldsFootnote1179044
+Node: Nonconstant Fields179130
+Node: Changing Fields181332
+Node: Field Separators187313
+Node: Default Field Splitting189942
+Node: Regexp Field Splitting191059
+Node: Single Character Fields194401
+Node: Command Line Field Separator195460
+Node: Field Splitting Summary198901
+Ref: Field Splitting SummaryFootnote1202093
+Node: Constant Size202194
+Node: Splitting By Content206778
+Ref: Splitting By ContentFootnote1210504
+Node: Multiple Line210544
+Ref: Multiple LineFootnote1216391
+Node: Getline216570
+Node: Plain Getline218786
+Node: Getline/Variable220875
+Node: Getline/File222016
+Node: Getline/Variable/File223338
+Ref: Getline/Variable/FileFootnote1224937
+Node: Getline/Pipe225024
+Node: Getline/Variable/Pipe227584
+Node: Getline/Coprocess228691
+Node: Getline/Variable/Coprocess229934
+Node: Getline Notes230648
+Node: Getline Summary232590
+Ref: tablegetlinevariants232933
+Node: Read Timeout233789
+Ref: Read TimeoutFootnote1237534
+Node: Command line directories237591
+Node: Printing238221
+Node: Print239852
+Node: Print Examples241189
+Node: Output Separators243973
+Node: OFMT245733
+Node: Printf247091
+Node: Basic Printf247997
+Node: Control Letters249536
+Node: Format Modifiers253348
+Node: Printf Examples259357
+Node: Redirection262072
+Node: Special Files269056
+Node: Special FD269589
+Ref: Special FDFootnote1273214
+Node: Special Network273288
+Node: Special Caveats274138
+Node: Close Files And Pipes274934
+Ref: Close Files And PipesFootnote1281957
+Ref: Close Files And PipesFootnote2282105
+Node: Expressions282255
+Node: Values283387
+Node: Constants284063
+Node: Scalar Constants284743
+Ref: Scalar ConstantsFootnote1285602
+Node: Nondecimalnumbers285784
+Node: Regexp Constants288843
+Node: Using Constant Regexps289318
+Node: Variables292373
+Node: Using Variables293028
+Node: Assignment Options294752
+Node: Conversion296624
+Ref: tablelocaleaffects302000
+Ref: ConversionFootnote1302624
+Node: All Operators302733
+Node: Arithmetic Ops303363
+Node: Concatenation305868
+Ref: ConcatenationFootnote1308661
+Node: Assignment Ops308781
+Ref: tableassignops313769
+Node: Increment Ops315177
+Node: Truth Values and Conditions318647
+Node: Truth Values319730
+Node: Typing and Comparison320779
+Node: Variable Typing321568
+Ref: Variable TypingFootnote1325465
+Node: Comparison Operators325587
+Ref: tablerelationalops325997
+Node: POSIX String Comparison329546
+Ref: POSIX String ComparisonFootnote1330502
+Node: Boolean Ops330640
+Ref: Boolean OpsFootnote1334718
+Node: Conditional Exp334809
+Node: Function Calls336541
+Node: Precedence340135
+Node: Locales343804
+Node: Patterns and Actions344893
+Node: Pattern Overview345947
+Node: Regexp Patterns347616
+Node: Expression Patterns348159
+Node: Ranges351844
+Node: BEGIN/END354810
+Node: Using BEGIN/END355572
+Ref: Using BEGIN/ENDFootnote1358303
+Node: I/O And BEGIN/END358409
+Node: BEGINFILE/ENDFILE360691
+Node: Empty363584
+Node: Using Shell Variables363900
+Node: Action Overview366185
+Node: Statements368542
+Node: If Statement370396
+Node: While Statement371895
+Node: Do Statement373939
+Node: For Statement375095
+Node: Switch Statement378247
+Node: Break Statement380344
+Node: Continue Statement382334
+Node: Next Statement384127
+Node: Nextfile Statement386517
+Node: Exit Statement389062
+Node: Builtin Variables391478
+Node: Usermodified392573
+Ref: UsermodifiedFootnote1400928
+Node: Autoset400990
+Ref: AutosetFootnote1410898
+Node: ARGC and ARGV411103
+Node: Arrays414954
+Node: Array Basics416459
+Node: Array Intro417285
+Node: Reference to Elements421603
+Node: Assigning Elements423873
+Node: Array Example424364
+Node: Scanning an Array426096
+Node: Controlling Scanning428410
+Ref: Controlling ScanningFootnote1433343
+Node: Delete433659
+Ref: DeleteFootnote1436094
+Node: Numeric Array Subscripts436151
+Node: Uninitialized Subscripts438334
+Node: Multidimensional439962
+Node: Multiscanning443056
+Node: Arrays of Arrays444647
+Node: Functions449292
+Node: Builtin450114
+Node: Calling Builtin451192
+Node: Numeric Functions453180
+Ref: Numeric FunctionsFootnote1457012
+Ref: Numeric FunctionsFootnote2457369
+Ref: Numeric FunctionsFootnote3457417
+Node: String Functions457686
+Ref: String FunctionsFootnote1481183
+Ref: String FunctionsFootnote2481312
+Ref: String FunctionsFootnote3481560
+Node: Gory Details481647
+Ref: tablesubescapes483326
+Ref: tablesubposix92484680
+Ref: tablesubproposed486023
+Ref: tableposixsub487373
+Ref: tablegensubescapes488919
+Ref: Gory DetailsFootnote1490126
+Ref: Gory DetailsFootnote2490177
+Node: I/O Functions490328
+Ref: I/O FunctionsFootnote1496983
+Node: Time Functions497130
+Ref: Time FunctionsFootnote1508022
+Ref: Time FunctionsFootnote2508090
+Ref: Time FunctionsFootnote3508248
+Ref: Time FunctionsFootnote4508359
+Ref: Time FunctionsFootnote5508471
+Ref: Time FunctionsFootnote6508698
+Node: Bitwise Functions508964
+Ref: tablebitwiseops509522
+Ref: Bitwise FunctionsFootnote1513682
+Node: Type Functions513866
+Node: I18N Functions514336
+Node: Userdefined515963
+Node: Definition Syntax516767
+Ref: Definition SyntaxFootnote1521677
+Node: Function Example521746
+Node: Function Caveats524340
+Node: Calling A Function524761
+Node: Variable Scope525876
+Node: Pass By Value/Reference527851
+Node: Return Statement531291
+Node: Dynamic Typing534272
+Node: Indirect Calls535007
+Node: Internationalization544692
+Node: I18N and L10N546131
+Node: Explaining gettext546817
+Ref: Explaining gettextFootnote1551883
+Ref: Explaining gettextFootnote2552067
+Node: Programmer i18n552232
+Node: Translator i18n556432
+Node: String Extraction557225
+Ref: String ExtractionFootnote1558186
+Node: Printf Ordering558272
+Ref: Printf OrderingFootnote1561056
+Node: I18N Portability561120
+Ref: I18N PortabilityFootnote1563569
+Node: I18N Example563632
+Ref: I18N ExampleFootnote1566267
+Node: Gawk I18N566339
+Node: Arbitrary Precision Arithmetic566956
+Ref: Arbitrary Precision ArithmeticFootnote1568608
+Node: General Arithmetic568756
+Node: Floating Point Issues570476
+Node: String Conversion Precision571571
+Ref: String Conversion PrecisionFootnote1573277
+Node: Unexpected Results573386
+Node: POSIX Floating Point Problems575224
+Ref: POSIX Floating Point ProblemsFootnote1579049
+Node: Integer Programming579087
+Node: Floatingpoint Programming580835
+Node: Floatingpoint Representation586757
+Node: Floatingpoint Context587924
+Ref: tableieeeformats588766
+Node: Rounding Mode590150
+Ref: tableroundingmodes590629
+Ref: Rounding ModeFootnote1593633
+Node: Gawk and MPFR593814
+Node: Arbitrary Precision Floats595055
+Ref: Arbitrary Precision FloatsFootnote1597477
+Node: Setting Precision597788
+Node: Setting Rounding Mode600515
+Ref: tablegawkroundingmodes600919
+Node: Floatingpoint Constants602116
+Node: Changing Precision603538
+Ref: Changing PrecisionFootnote1604938
+Node: Exact Arithmetic605112
+Node: Arbitrary Precision Integers608210
+Ref: Arbitrary Precision IntegersFootnote1611292
+Node: Advanced Features611439
+Node: Nondecimal Data612962
+Node: Array Sorting614545
+Node: Controlling Array Traversal615242
+Node: Array Sorting Functions623479
+Ref: Array Sorting FunctionsFootnote1627153
+Ref: Array Sorting FunctionsFootnote2627246
+Node: Twoway I/O627440
+Ref: Twoway I/OFootnote1632872
+Node: TCP/IP Networking632942
+Node: Profiling635786
+Node: Library Functions643240
+Ref: Library FunctionsFootnote1646247
+Node: Library Names646418
+Ref: Library NamesFootnote1649889
+Ref: Library NamesFootnote2650109
+Node: General Functions650195
+Node: Strtonum Function651148
+Node: Assert Function654078
+Node: Round Function657404
+Node: Cliff Random Function658947
+Node: Ordinal Functions659963
+Ref: Ordinal FunctionsFootnote1663033
+Ref: Ordinal FunctionsFootnote2663285
+Node: Join Function663494
+Ref: Join FunctionFootnote1665265
+Node: Gettimeofday Function665465
+Node: Data File Management669180
+Node: Filetrans Function669812
+Node: Rewind Function673951
+Node: File Checking675338
+Node: Empty Files676432
+Node: Ignoring Assigns678662
+Node: Getopt Function680215
+Ref: Getopt FunctionFootnote1691519
+Node: Passwd Functions691722
+Ref: Passwd FunctionsFootnote1700697
+Node: Group Functions700785
+Node: Walking Arrays708869
+Node: Sample Programs710438
+Node: Running Examples711103
+Node: Clones711831
+Node: Cut Program713055
+Node: Egrep Program722900
+Ref: Egrep ProgramFootnote1730673
+Node: Id Program730783
+Node: Split Program734399
+Ref: Split ProgramFootnote1737918
+Node: Tee Program738046
+Node: Uniq Program740849
+Node: Wc Program748278
+Ref: Wc ProgramFootnote1752544
+Ref: Wc ProgramFootnote2752744
+Node: Miscellaneous Programs752836
+Node: Dupword Program754024
+Node: Alarm Program756055
+Node: Translate Program760804
+Ref: Translate ProgramFootnote1765191
+Ref: Translate ProgramFootnote2765419
+Node: Labels Program765553
+Ref: Labels ProgramFootnote1768924
+Node: Word Sorting769008
+Node: History Sorting772892
+Node: Extract Program774731
+Ref: Extract ProgramFootnote1782214
+Node: Simple Sed782342
+Node: Igawk Program785404
+Ref: Igawk ProgramFootnote1800561
+Ref: Igawk ProgramFootnote2800762
+Node: Anagram Program800900
+Node: Signature Program803968
+Node: Debugger805068
+Node: Debugging806020
+Node: Debugging Concepts806453
+Node: Debugging Terms808309
+Node: Awk Debugging810906
+Node: Sample Debugging Session811798
+Node: Debugger Invocation812318
+Node: Finding The Bug813647
+Node: List of Debugger Commands820135
+Node: Breakpoint Control821469
+Node: Debugger Execution Control825133
+Node: Viewing And Changing Data828493
+Node: Execution Stack831849
+Node: Debugger Info833316
+Node: Miscellaneous Debugger Commands837297
+Node: Readline Support842742
+Node: Limitations843573
+Node: Language History845825
+Node: V7/SVR3.1847337
+Node: SVR4849658
+Node: POSIX851100
+Node: BTL852108
+Node: POSIX/GNU852842
+Node: Common Extensions858133
+Node: Ranges and Locales859240
+Ref: Ranges and LocalesFootnote1863844
+Node: Contributors864065
+Node: Installation868326
+Node: Gawk Distribution869220
+Node: Getting869704
+Node: Extracting870530
+Node: Distribution contents872222
+Node: Unix Installation877444
+Node: Quick Installation878061
+Node: Additional Configuration Options880023
+Node: Configuration Philosophy881500
+Node: NonUnix Installation883842
+Node: PC Installation884300
+Node: PC Binary Installation885599
+Node: PC Compiling887447
+Node: PC Testing890391
+Node: PC Using891567
+Node: Cygwin895752
+Node: MSYS896752
+Node: VMS Installation897266
+Node: VMS Compilation897869
+Ref: VMS CompilationFootnote1898876
+Node: VMS Installation Details898934
+Node: VMS Running900569
+Node: VMS Old Gawk902176
+Node: Bugs902650
+Node: Other Versions906502
+Node: Notes911817
+Node: Compatibility Mode912509
+Node: Additions913292
+Node: Accessing The Source914104
+Node: Adding Code915529
+Node: New Ports921496
+Node: Dynamic Extensions925609
+Node: Internals927049
+Node: Plugin License935871
+Node: Loading Extensions936509
+Node: Sample Library938348
+Node: Internal File Description939038
+Node: Internal File Ops942753
+Ref: Internal File OpsFootnote1947495
+Node: Using Internal File Ops947635
+Node: Future Extensions950012
+Node: Basic Concepts952516
+Node: Basic High Level953197
+Ref: Basic High LevelFootnote1957232
+Node: Basic Data Typing957417
+Node: Glossary960772
+Node: Copying985748
+Node: GNU Free Documentation License1023305
+Node: Index1048442
End Tag Table
diff --git a/doc/gawk.texi b/doc/gawk.texi
index d3f5c67..672b6f3 100644
--- a/doc/gawk.texi
+++ b/doc/gawk.texi
@@ -558,21 +558,29 @@ particular records in a file and perform operations upon
them.
* I18N Portability:: @command{awk}level portability issues.
* I18N Example:: A simple i18n example.
* Gawk I18N:: @command{gawk} is also internationalized.
+* General Arithmetic:: An introduction to computer arithmetic.
+* Floating Point Issues:: Stuff to know about floatingpoint numbers.
+* String Conversion Precision:: The String Value Can Lie.
+* Unexpected Results:: Floating Point Numbers Are Not Abstract
+ Numbers.
+* POSIX Floating Point Problems:: Standards Versus Existing Practice.
+* Integer Programming:: Effective integer programming.
* Floatingpoint Programming:: Effective floatingpoint programming.
* Floatingpoint Representation:: Binary floatingpoint representation.
* Floatingpoint Context:: Floatingpoint context.
* Rounding Mode:: Floatingpoint rounding mode.
+* Gawk and MPFR:: How @command{gawk} provides
+ arbitrary-precision arithmetic.
* Arbitrary Precision Floats:: Arbitrary precision floatingpoint
arithmetic with @command{gawk}.
* Setting Precision:: Setting the working precision.
* Setting Rounding Mode:: Setting the rounding mode.
* Floatingpoint Constants:: Representing floatingpoint constants.
* Changing Precision:: Changing the precision of a number.
* Exact Arithmetic:: Exact arithmetic with floatingpoint
numbers.
* Integer Programming:: Effective integer programming.
* Arbitrary Precision Integers:: Arbitrary precision integer
 arithmetic with @command{gawk}.
* MPFR and GMP Libraries:: Information about the MPFR and GMP
libraries.
+* Exact Arithmetic:: Exact arithmetic with floatingpoint
+ numbers.
+* Arbitrary Precision Integers:: Arbitrary precision integer arithmetic with
+ @command{gawk}.
* Nondecimal Data:: Allowing nondecimal input data.
* Array Sorting:: Facilities for controlling array traversal
and sorting arrays.
@@ 637,14 +645,14 @@ particular records in a file and perform operations upon
them.
* Anagram Program:: Finding anagrams from a dictionary.
* Signature Program:: People do amazing things with too much time
on their hands.
* Debugging:: Introduction to @command{gawk} Debugger.
+* Debugging:: Introduction to @command{gawk} debugger.
* Debugging Concepts:: Debugging in General.
* Debugging Terms:: Additional Debugging Concepts.
* Awk Debugging:: Awk Debugging.
* Sample Debugging Session:: Sample Debugging Session.
+* Sample Debugging Session:: Sample debugging session.
* Debugger Invocation:: How to Start the Debugger.
* Finding The Bug:: Finding the Bug.
* List of Debugger Commands:: Main Commands.
+* List of Debugger Commands:: Main debugger commands.
* Breakpoint Control:: Control of Breakpoints.
* Debugger Execution Control:: Control of Execution.
* Viewing And Changing Data:: Viewing and Changing Data.
@@ 652,8 +660,8 @@ particular records in a file and perform operations upon
them.
* Debugger Info:: Obtaining Information about the Program and
the Debugger State.
* Miscellaneous Debugger Commands:: Miscellaneous Commands.
* Readline Support:: Readline Support.
* Limitations:: Limitations and Future Plans.
+* Readline Support:: Readline support.
+* Limitations:: Limitations and future plans.
* V7/SVR3.1:: The major changes between V7 and System V
Release 3.1.
* SVR4:: Minor changes between System V Releases 3.1
@@ 718,11 +726,6 @@ particular records in a file and perform operations upon
them.
day.
* Basic High Level:: The high level view.
* Basic Data Typing:: A very quick intro to data types.
* Floating Point Issues:: Stuff to know about floatingpoint numbers.
* String Conversion Precision:: The String Value Can Lie.
* Unexpected Results:: Floating Point Numbers Are Not Abstract
 Numbers.
* POSIX Floating Point Problems:: Standards Versus Existing Practice.
@end detailmenu
@end menu
@@ 3600,8 +3603,8 @@ behaves.
@menu
* AWKPATH Variable:: Searching directories for @command{awk}
programs.
* AWKLIBPATH Variable:: Searching directories for @command{awk}
 shared libraries.
+* AWKLIBPATH Variable:: Searching directories for @command{awk} shared
+ libraries.
* Other Environment Variables:: The environment variables.
@end menu
@@ 5242,7 +5245,6 @@ used with it do not have to be named on the @command{awk}
command line
* Getline:: Reading files under explicit program control
using the @code{getline} function.
* Read Timeout:: Reading input with a timeout.

* Command line directories:: What happens if you put a directory on the
command line.
@end menu
@@ 18433,7 +18435,7 @@ and fatal errors in the local language.
@c ENDOFRANGE inloc
@node Arbitrary Precision Arithmetic
address@hidden Arbitrary Precision Arithmetic with @command{gawk}
address@hidden Arithmetic and Arbitrary Precision Arithmetic with @command{gawk}
@cindex arbitrary precision
@cindex multiple precision
@cindex infinite precision
@@ 18448,204 +18450,534 @@ to believe. Novice computer users solve this
problem by implicitly trusting
in the computer as an infallible authority; they tend to believe that all
digits of a printed answer are significant. Disillusioned computer users have
just the opposite approach; they are constantly afraid that their answers
are almost meaningless.}

+are almost address@hidden
Donald address@hidden E.@: Knuth.
@cite{The Art of Computer Programming}. Volume 2,
@cite{Seminumerical Algorithms}, third edition,
1998, ISBN 0201896834, p.@: 229.}
@end quotation
This @value{SECTION} describes how to use the arbitrary precision
(also known as @dfn{multiple precision} or @dfn{infinite precision}) numeric
capabilities in @command{gawk} to produce maximally accurate results
when you need it. But first you should check if your version of
address@hidden supports arbitrary precision arithmetic.
The easiest way to find out is to look at the output of
the following command:

address@hidden
$ @kbd{gawk --version}
address@hidden GNU Awk 4.1.0 (GNU MPFR 3.1.0, GNU MP 5.0.3)
address@hidden Copyright (C) 1989, 1991-2012 Free Software Foundation.
address@hidden
address@hidden example

address@hidden uses the
address@hidden://www.mpfr.org, GNU MPFR}
and
address@hidden://gmplib.org, GNU MP} (GMP)
libraries for arbitrary precision
arithmetic on numbers. So if you do not see the names of these libraries
in the output, then your version of @command{gawk} does not support
arbitrary precision arithmetic.
+This @value{CHAPTER} discusses issues that you may encounter
+when performing arithmetic. It begins by discussing some of
+the general attributes of computer arithmetic, along with how
+this can influence what you see when running @command{awk} programs.
+This discussion applies to all versions of @command{awk}.
Even if you aren't interested in arbitrary precision arithmetic, you
may still benefit from knowing about how @command{gawk} handles numbers
in general, and the limitations of doing arithmetic with ordinary
address@hidden numbers.
+Then the discussion moves on to @dfn{arbitrary precision
+arithmetic}, a feature which is specific to @command{gawk}.
@menu
* Floatingpoint Programming:: Effective Floatingpoint Programming.
* Floatingpoint Representation:: Binary Floatingpoint Representation.
* Floatingpoint Context:: Floatingpoint Context.
* Rounding Mode:: Floatingpoint Rounding Mode.
* Arbitrary Precision Floats:: Arbitrary Precision Floatingpoint
 Arithmetic with @command{gawk}.
* Setting Precision:: Setting the Working Precision.
* Setting Rounding Mode:: Setting the Rounding Mode.
* Floatingpoint Constants:: Representing Floatingpoint Constants.
* Changing Precision:: Changing the Precision of a Number.
* Exact Arithmetic:: Exact Arithmetic with Floatingpoint
Numbers.
* Integer Programming:: Effective Integer Programming.
* Arbitrary Precision Integers:: Arbitrary Precision Integer
 Arithmetic with @command{gawk}.
* MPFR and GMP Libraries:: Information About the MPFR and GMP
Libraries.
+* General Arithmetic:: An introduction to computer arithmetic.
+* Floatingpoint Programming:: Effective floatingpoint programming.
+* Gawk and MPFR:: How @command{gawk} provides
+ arbitrary-precision arithmetic.
+* Arbitrary Precision Floats:: Arbitrary precision floatingpoint arithmetic
+ with @command{gawk}.
+* Arbitrary Precision Integers:: Arbitrary precision integer arithmetic with
+ @command{gawk}.
@end menu
address@hidden Floatingpoint Programming
address@hidden Effective Floatingpoint Programming
address@hidden General Arithmetic
address@hidden A General Description of Computer Arithmetic
Numerical programming is an extensive area; if you need to develop
sophisticated numerical algorithms then @command{gawk} may not be
the ideal tool, and this documentation may not be sufficient.
address@hidden FIXME: JOHN: Do you want to cite some actual books?
It might require a book or two to communicate how to compute
with ideal accuracy and precision
and the result often depends on the particular application.
address@hidden integers
address@hidden floatingpoint, numbers
address@hidden numbers, floatingpoint
+Within computers, there are two kinds of numeric values: @dfn{integers}
+and @dfn{floatingpoint}.
+In school, integer values were referred to as ``whole'' numbers; that is,
+numbers without any fractional part, such as 1, 42, or @minus{}17.
+The advantage to integer numbers is that they represent values exactly.
+The disadvantage is that their range is limited. On most systems,
+this range is @minus{}2,147,483,648 to 2,147,483,647.
+However, many systems now support a range from
+@minus{}9,223,372,036,854,775,808 to 9,223,372,036,854,775,807.
address@hidden NOTE
A floatingpoint calculation's @dfn{accuracy} is how close it comes
to the real value. This is as opposed to the @dfn{precision}, which
usually refers to the number of bits used to represent the number
(see @uref{http://en.wikipedia.org/wiki/Accuracy_and_precision,
the Wikipedia article} for more information).
address@hidden quotation
address@hidden unsigned integers
address@hidden integers, unsigned
+Integer values come in two flavors: @dfn{signed} and @dfn{unsigned}.
+Signed values may be negative or positive, with the range of values just
+described.
+Unsigned values are always positive. On most systems,
+the range is from 0 to 4,294,967,295.
+However, many systems now support a range from
+0 to 18,446,744,073,709,551,615.
Binary floatingpoint representations and arithmetic are inexact.
Simple values like 0.1 cannot be precisely represented using
binary floatingpoint numbers, and the limited precision of
floatingpoint numbers means that slight changes in
the order of operations or the precision of intermediate storage
can change the result. To make matters worse with arbitrary precision
floatingpoint, you can set the precision before starting a computation,
but then you cannot be sure of the number of significant decimal places
in the final result.
address@hidden double precision floatingpoint
address@hidden single precision floatingpoint
+Floatingpoint numbers represent what are called ``real'' numbers; i.e.,
+those that do have a fractional part, such as 3.1415927.
+The advantage to floatingpoint numbers is that they
+can represent a much larger range of values.
+The disadvantage is that there are numbers that they cannot represent
+exactly.
address@hidden uses @dfn{double precision} floatingpoint numbers, which
+can hold more digits than @dfn{single precision}
+floatingpoint numbers.
address@hidden Floatingpoint issues are discussed more fully in
address@hidden @ref{Floating Point Issues}.
Sometimes you need to think more about what you really want
and what's really happening. Consider the two numbers
in the following example:
+There are several important issues to be aware of, described next.
address@hidden
x = 0.875 # 1/2 + 1/4 + 1/8
y = 0.425
address@hidden example
address@hidden
+* Floating Point Issues:: Stuff to know about floatingpoint numbers.
+* Integer Programming:: Effective integer programming.
address@hidden menu
Unlike the number in @code{y}, the number stored in @code{x}
is exactly representable
in binary since it can be written as a finite sum of one or
more fractions whose denominators are all powers of two.
When @command{gawk} reads a floatingpoint number from
program source, it automatically rounds that number to whatever
precision your machine supports. If you try to print the numeric
content of a variable using an output format string of @code{"%.17g"},
it may not produce the same number as you assigned to it:
address@hidden Floating Point Issues
address@hidden FloatingPoint Number Caveats
address@hidden
$ @kbd{gawk 'BEGIN @{ x = 0.875; y = 0.425}
> @kbd{ printf("%0.17g, %0.17g\n", x, y) @}'}
address@hidden 0.875, 0.42499999999999999
address@hidden example
+As mentioned earlier, floatingpoint numbers represent what are called
+``real'' numbers, i.e., those that have a fractional part. @command{awk}
+uses double precision floatingpoint numbers to represent all
+numeric values. This @value{SECTION} describes some of the issues
+involved in using floatingpoint numbers.
Often the error is so small you do not even notice it, and if you do,
you can always specify how much precision you would like in your output.
Usually this is a format string like @code{"%.15g"}, which when
used in the previous example, produces an output identical to the input.
+There is a very nice
address@hidden://www.validlab.com/goldberg/paper.pdf, paper on floatingpoint
arithmetic}
+by David Goldberg,
+``What Every Computer Scientist Should Know About Floatingpoint Arithmetic,''
address@hidden Computing Surveys} @strong{23}, 1 (199103), 548.
+This is worth reading if you are interested in the details,
+but it does require a background in computer science.
Because the underlying representation can be little bit off from the exact
value,
comparing floats to see if they are equal is generally not a good idea.
Here is an example where it does not work like you expect:
address@hidden
+* String Conversion Precision:: The String Value Can Lie.
+* Unexpected Results:: Floating Point Numbers Are Not Abstract
+ Numbers.
+* POSIX Floating Point Problems:: Standards Versus Existing Practice.
address@hidden menu
address@hidden
$ @kbd{gawk 'BEGIN @{ print (0.1 + 12.2 == 12.3) @}'}
address@hidden 0
address@hidden example
address@hidden String Conversion Precision
address@hidden The String Value Can Lie
The loss of accuracy during a single computation with floatingpoint numbers
usually isn't enough to worry about. However, if you compute a value
which is the result of a sequence of floating point operations,
the error can accumulate and greatly affect the computation itself.
Here is an attempt to compute the value of the constant
address@hidden using one of its many series representations:
+Internally, @command{awk} keeps both the numeric value
+(double precision floatingpoint) and the string value for a variable.
+Separately, @command{awk} keeps
+track of what type the variable has
+(@pxref{Typing and Comparison}),
+which plays a role in how variables are used in comparisons.
+
+It is important to note that the string value for a number may not
+reflect the full value (all the digits) that the numeric value
+actually contains.
+The following program (@file{values.awk}) illustrates this:
@example
BEGIN @{
 x = 1.0 / sqrt(3.0)
 n = 6
 for (i = 1; i < 30; i++) @{
 n = n * 2.0
 x = (sqrt(x * x + 1) - 1) / x
 printf("%.15f\n", n * x)
 @}
address@hidden
+ sum = $1 + $2
+ # see it for what it is
+ printf("sum = %.12g\n", sum)
+ # use CONVFMT
+ a = "<" sum ">"
+ print "a =", a
+ # use OFMT
+ print "sum =", sum
@}
@end example
When run, the early errors propagating through later computations
cause the loop to terminate prematurely after an attempt to divide by zero.
address@hidden
+This program shows the full value of the sum of @code{$1} and @code{$2}
+using @code{printf}, and then prints the string values obtained
+from both automatic conversion (via @code{CONVFMT}) and
+from printing (via @code{OFMT}).
+
+Here is what happens when the program is run:
@example
$ @kbd{gawk -f pi.awk}
address@hidden 3.215390309173475
address@hidden 3.159659942097510
address@hidden 3.146086215131467
address@hidden 3.142714599645573
address@hidden
address@hidden 3.224515243534819
address@hidden 2.791117213058638
address@hidden 0.000000000000000
address@hidden gawk: pi.awk:6: fatal: division by zero attempted
+$ @kbd{echo 3.654321 1.2345678 | awk -f values.awk}
address@hidden sum = 4.8888888
address@hidden a = <4.88889>
address@hidden sum = 4.88889
@end example
Here is one more example where the inaccuracies in internal representations
yield an unexpected result:
+This makes it clear that the full numeric value is different from
+what the default string representations show.
address@hidden
$ @kbd{gawk 'BEGIN @{}
> @kbd{for (d = 1.1; d <= 1.5; d += 0.1)}
> @kbd{i++}
> @kbd{print i}
> @address@hidden'}
address@hidden 4
address@hidden example
address@hidden's default value is @code{"%.6g"}, which yields a value with
+at most six significant digits. For some applications, you might want to
+change it to specify more precision.
+On most modern machines, most of the time,
+17 digits is enough to capture a floatingpoint number's
+value address@hidden cases can require up to
+752 digits (!), but we doubt that you need to worry about this.}
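
The effect of raising the conversion precision can be seen in a small
sketch. The function name @code{convfmt_demo} and the values 0.1 and 0.2
are illustrative choices, not part of the manual; the sketch assumes a
POSIX @command{awk} using IEEE 754 doubles.

```shell
# Show how CONVFMT controls the string value produced by automatic conversion.
# convfmt_demo is a hypothetical helper name used only for this illustration.
convfmt_demo() {
    awk 'BEGIN {
        a = (0.1 + 0.2) ""        # converted with the default CONVFMT, "%.6g"
        CONVFMT = "%.17g"         # ask for 17 significant digits
        b = (0.1 + 0.2) ""        # same arithmetic, converted with more digits
        print a
        print b
    }'
}
convfmt_demo
```

The first line printed is the rounded string value; the second exposes the
digits that the numeric value actually carries.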
Can computation using arbitrary precision help with the previous examples?
If you are impatient to know, see
address@hidden Arithmetic}.
address@hidden Unexpected Results
address@hidden Floating Point Numbers Are Not Abstract Numbers
Instead of arbitrary precision floating-point arithmetic,
often all you need is an adjustment of your logic
or a different order for the operations in your calculation.
The stability and the accuracy of the computation of the constant @value{PI}
in the previous example can be enhanced by using the following
simple algebraic transformation:
address@hidden floatingpoint, numbers
+Unlike numbers in the abstract sense (such as what you studied in high school
+or college arithmetic), numbers stored in computers are limited in certain ways.
+They cannot represent an infinite number of digits, nor can they always
+represent things exactly.
+In particular,
+floatingpoint numbers cannot
+always represent values exactly. Here is an example:
@example
(sqrt(x * x + 1) - 1) / x = x / (sqrt(x * x + 1) + 1)
+$ @kbd{awk '@{ printf("%010d\n", $1 * 100) @}'}
+515.79
address@hidden 0000051579
+515.80
address@hidden 0000051579
+515.81
address@hidden 0000051580
+515.82
address@hidden 0000051582
address@hidden@value{CTL}d}
@end example
There is no need to be unduly suspicious about the results from
floatingpoint arithmetic. The lesson to remember is that
floatingpoint math is always more complex than the math using
pencil and paper. In order to take advantage of the power
of computer floatingpoint, you need to know its limitations
and work within them. For most casual use of floatingpoint arithmetic,
address@hidden
+This shows that some values can be represented exactly,
+whereas others are only approximated. This is not a ``bug''
+in @command{awk}, but simply an artifact of how computers
+represent numbers.
+
address@hidden negative zero
address@hidden positive zero
address@hidden address@hidden negative vs.@: positive
+Another peculiarity of floatingpoint numbers on modern systems
+is that they often have more than one representation for the number zero!
+In particular, it is possible to represent ``minus zero'' as well as
+regular, or ``positive'' zero.
+
+This example shows that negative and positive zero are distinct values
+when stored internally, but that they are in fact equal to each other,
+as well as to ``regular'' zero:
+
address@hidden
+$ @kbd{gawk 'BEGIN @{ mz = -0 ; pz = 0}
+> @kbd{printf "-0 = %g, +0 = %g, (-0 == +0) -> %d\n", mz, pz, mz == pz}
+> @kbd{printf "mz == 0 -> %d, pz == 0 -> %d\n", mz == 0, pz == 0}
+> @address@hidden'}
address@hidden -0 = -0, +0 = 0, (-0 == +0) -> 1
address@hidden mz == 0 -> 1, pz == 0 -> 1
address@hidden example
+
+It helps to keep this in mind should you process numeric data
+that contains negative zero values; the fact that the zero is negative
+is noted and can affect comparisons.
+
address@hidden POSIX Floating Point Problems
address@hidden Standards Versus Existing Practice
+
+Historically, @command{awk} has converted any nonnumeric looking string
+to the numeric value zero, when required. Furthermore, the original
+definition of the language and the original POSIX standards specified that
address@hidden only understands decimal numbers (base 10), and not octal
+(base 8) or hexadecimal numbers (base 16).
+
+Changes in the language of the
+2001 and 2004 POSIX standards can be interpreted to imply that @command{awk}
+should support additional features. These features are:
+
address@hidden @bullet
address@hidden
+Interpretation of floating point data values specified in hexadecimal
+notation (@samp{0xDEADBEEF}). (Note: data values, @emph{not}
+source code constants.)
+
address@hidden
+Support for the special IEEE 754 floating point values ``Not A Number''
+(NaN), positive Infinity (``inf'') and negative Infinity (address@hidden'').
+In particular, the format for these values is as specified by the ISO 1999
+C standard, which ignores case and can allow machinedependent additional
+characters after the @samp{nan} and allow either @samp{inf} or @samp{infinity}.
address@hidden itemize
+
+The first problem is that both of these are clear changes to historical
+practice:
+
address@hidden @bullet
address@hidden
+The @command{gawk} maintainer feels that supporting hexadecimal floating
+point values, in particular, is ugly, and was never intended by the
+original designers to be part of the language.
+
address@hidden
+Allowing completely alphabetic strings to have valid numeric
+values is also a very severe departure from historical practice.
address@hidden itemize
+
+The second problem is that the @code{gawk} maintainer feels that this
+interpretation of the standard, which requires a certain amount of
+``language lawyering'' to arrive at in the first place, was not even
+intended by the standard developers. In other words, ``we see how you
+got where you are, but we don't think that that's where you want to be.''
+
+Recognizing the above issues, but attempting to provide compatibility
+with the earlier versions of the standard,
+the 2008 POSIX standard added explicit wording to allow, but not require,
+that @command{awk} support hexadecimal floating point values and
+special values for ``Not A Number'' and infinity.
+
+Although the @command{gawk} maintainer continues to feel that
+providing those features is inadvisable,
+nevertheless, on systems that support IEEE floating point, it seems
+reasonable to provide @emph{some} way to support NaN and Infinity values.
+The solution implemented in @command{gawk} is as follows:
+
address@hidden @bullet
address@hidden
+With the @option{--posix} command-line option, @command{gawk} becomes
+``hands off.'' String values are passed directly to the system library's
address@hidden()} function, and if it successfully returns a numeric value,
+that is what's address@hidden asked for it, you got it.}
+By definition, the results are not portable across
+different systems. They are also a little surprising:
+
address@hidden
+$ @kbd{echo nanny | gawk --posix '@{ print $1 + 0 @}'}
address@hidden nan
+$ @kbd{echo 0xDeadBeef | gawk --posix '@{ print $1 + 0 @}'}
address@hidden 3735928559
address@hidden example
+
address@hidden
+Without @option{--posix}, @command{gawk} interprets the four strings
address@hidden,
address@hidden,
address@hidden,
+and
address@hidden
+specially, producing the corresponding special numeric values.
+The leading sign acts as a signal to @command{gawk} (and the user)
+that the value is really numeric. Hexadecimal floating point is
+not supported (unless you also use @option{--non-decimal-data},
+which is @emph{not} recommended). For example:
+
address@hidden
+$ @kbd{echo nanny | gawk '@{ print $1 + 0 @}'}
address@hidden 0
+$ @kbd{echo +nan | gawk '@{ print $1 + 0 @}'}
address@hidden nan
+$ @kbd{echo 0xDeadBeef | gawk '@{ print $1 + 0 @}'}
address@hidden 0
address@hidden example
+
address@hidden does ignore case in the four special values.
+Thus @samp{+nan} and @samp{+NaN} are the same.
address@hidden itemize
+
address@hidden Integer Programming
address@hidden Mixing Integers And Floatingpoint
+
+As has been mentioned already, @command{gawk} ordinarily uses hardware double
+precision with 64-bit IEEE binary floating-point representation
+for numbers on most systems. A large integer like 9007199254740997
+has a binary representation that, although finite, is more than 53 bits long;
+it must also be rounded to 53 bits.
+The biggest integer that can be stored in a C @code{double} is usually the same
+as the largest possible value of a @code{double}. If your system @code{double}
+is an IEEE 64-bit @code{double}, this largest possible value is an integer and
+can be represented precisely. What more should one know about integers?
+
+If you want to know what is the largest integer, such that it and
+all smaller integers can be stored in 64-bit doubles without losing precision,
+then the answer is
+@iftex
+@math{2^{53}}.
+@end iftex
+@ifnottex
+2^53.
+@end ifnottex
+The next representable number is the even number
+@iftex
+@math{2^{53} + 2},
+@end iftex
+@ifnottex
+2^53 + 2,
+@end ifnottex
+meaning it is unlikely that you will be able to make
+@command{gawk} print
+@iftex
+@math{2^{53} + 1}
+@end iftex
+@ifnottex
+2^53 + 1
+@end ifnottex
+in integer format.
+The range of integers exactly representable by a 64-bit double
+is
+@iftex
+@math{[-2^{53}, 2^{53}]}.
+@end iftex
+@ifnottex
+[@minus{}2^53, 2^53].
+@end ifnottex
+If you ever see an integer outside this range in @command{gawk}
+using 64-bit doubles, you have reason to be very suspicious about
+the accuracy of the output. Here is a simple program with erroneous output:
+
address@hidden
+$ @kbd{gawk 'BEGIN @{ i = 2^53 - 1; for (j = 0; j < 4; j++) print i + j @}'}
address@hidden 9007199254740991
address@hidden 9007199254740992
address@hidden 9007199254740992
address@hidden 9007199254740994
address@hidden example
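
A program can guard against this by checking whether a value lies in the
exactly representable range before trusting it as an integer. The
following is a minimal sketch, assuming 64-bit IEEE doubles; the names
@code{safe_int_demo} and @code{is_safe_integer} are hypothetical and are
not part of @command{gawk}.

```shell
# Check whether a value is an integer inside the range [-2^53, 2^53].
# safe_int_demo and is_safe_integer are illustrative names only.
safe_int_demo() {
    awk '
    function is_safe_integer(n) {
        return n == int(n) && n >= -2^53 && n <= 2^53
    }
    BEGIN {
        limit = 2^53
        print (limit + 1 == limit)        # 1: the sum rounds back to 2^53
        print is_safe_integer(limit - 1)  # 1: inside the exact range
        print is_safe_integer(limit * 2)  # 0: outside the exact range
    }'
}
safe_int_demo
```

The comparisons print 0 or 1 rather than the large values themselves, so
the result does not depend on how a particular @command{awk} formats big
integers.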
+
+The lesson is to not assume that any large integer printed by @command{gawk}
+represents an exact result from your computation, especially if it wraps
+around on your screen.
+
address@hidden Floatingpoint Programming
address@hidden Understanding Floatingpoint Programming
+
+Numerical programming is an extensive area; if you need to develop
+sophisticated numerical algorithms then @command{gawk} may not be
+the ideal tool, and this documentation may not be sufficient.
address@hidden FIXME: JOHN: Do you want to cite some actual books?
+It might require digesting a book or two to really internalize how to compute
+with ideal accuracy and precision
+and the result often depends on the particular application.
+
address@hidden NOTE
+A floatingpoint calculation's @dfn{accuracy} is how close it comes
+to the real value. This is as opposed to the @dfn{precision}, which
+usually refers to the number of bits used to represent the number
+(see @uref{http://en.wikipedia.org/wiki/Accuracy_and_precision,
+the Wikipedia article} for more information).
address@hidden quotation
+
+There are two options for doing floatingpoint calculations:
+hardware floatingpoint (as used by standard @command{awk} and
+the default for @command{gawk}), and @dfn{arbitraryprecision}
+floatingpoint, which is software based. This @value{CHAPTER}
+aims to provide enough information to understand both, and then
+will focus on @command{gawk}'s facilities for the latter.
+
+Binary floatingpoint representations and arithmetic are inexact.
+Simple values like 0.1 cannot be precisely represented using
+binary floatingpoint numbers, and the limited precision of
+floatingpoint numbers means that slight changes in
+the order of operations or the precision of intermediate storage
+can change the result. To make matters worse, with arbitrary precision
+floatingpoint, you can set the precision before starting a computation,
+but then you cannot be sure of the number of significant decimal places
+in the final result.
+
+Sometimes, before you start to write any code, you should think more
+about what you really want and what's really happening. Consider the
+two numbers in the following example:
+
address@hidden
+x = 0.875 # 1/2 + 1/4 + 1/8
+y = 0.425
address@hidden example
+
+Unlike the number in @code{y}, the number stored in @code{x}
+is exactly representable
+in binary since it can be written as a finite sum of one or
+more fractions whose denominators are all powers of two.
+When @command{gawk} reads a floatingpoint number from
+program source, it automatically rounds that number to whatever
+precision your machine supports. If you try to print the numeric
+content of a variable using an output format string of @code{"%.17g"},
+it may not produce the same number as you assigned to it:
+
address@hidden
+$ @kbd{gawk 'BEGIN @{ x = 0.875; y = 0.425}
+> @kbd{ printf("%0.17g, %0.17g\n", x, y) @}'}
address@hidden 0.875, 0.42499999999999999
address@hidden example
+
+Often the error is so small you do not even notice it, and if you do,
+you can always specify how much precision you would like in your output.
+Usually this is a format string like @code{"%.15g"}, which when
+used in the previous example, produces an output identical to the input.
+
+Because the underlying representation can be a little bit off from the exact value,
+comparing floating-point values to see if they are equal is generally not a good idea.
+Here is an example where it does not work like you expect:
+
address@hidden
+$ @kbd{gawk 'BEGIN @{ print (0.1 + 12.2 == 12.3) @}'}
address@hidden 0
address@hidden example
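
A common alternative is to test whether two values are close enough,
instead of testing for equality. The sketch below assumes that a fixed
absolute tolerance of 1e-9 suits the data; both the tolerance and the names
@code{approx_equal_demo}, @code{abs}, and @code{approx_equal} are
illustrative choices rather than anything @command{gawk} provides.

```shell
# Compare floating-point values within a tolerance instead of using ==.
# The helper names and the 1e-9 tolerance are assumptions for illustration.
approx_equal_demo() {
    awk '
    function abs(v) { return v < 0 ? -v : v }
    function approx_equal(a, b, eps) { return abs(a - b) <= eps }
    BEGIN {
        print (0.1 + 12.2 == 12.3)                  # exact comparison: 0
        print approx_equal(0.1 + 12.2, 12.3, 1e-9)  # tolerant comparison: 1
    }'
}
approx_equal_demo
```

An absolute tolerance is the simplest choice; for values that span many
orders of magnitude, a tolerance scaled to the operands may be more
appropriate.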
+
+The loss of accuracy during a single computation with floatingpoint numbers
+usually isn't enough to worry about. However, if you compute a value
+which is the result of a sequence of floating point operations,
+the error can accumulate and greatly affect the computation itself.
+Here is an attempt to compute the value of the constant
address@hidden using one of its many series representations:
+
address@hidden
+BEGIN @{
+ x = 1.0 / sqrt(3.0)
+ n = 6
+ for (i = 1; i < 30; i++) @{
+ n = n * 2.0
+ x = (sqrt(x * x + 1) - 1) / x
+ printf("%.15f\n", n * x)
+ @}
address@hidden
address@hidden example
+
+When run, the early errors propagating through later computations
+cause the loop to terminate prematurely after an attempt to divide by zero.
+
address@hidden
+$ @kbd{gawk -f pi.awk}
address@hidden 3.215390309173475
address@hidden 3.159659942097510
address@hidden 3.146086215131467
address@hidden 3.142714599645573
address@hidden
address@hidden 3.224515243534819
address@hidden 2.791117213058638
address@hidden 0.000000000000000
address@hidden gawk: pi.awk:6: fatal: division by zero attempted
address@hidden example
+
+Here is one more example where the inaccuracies in internal representations
+yield an unexpected result:
+
address@hidden
+$ @kbd{gawk 'BEGIN @{}
+> @kbd{for (d = 1.1; d <= 1.5; d += 0.1)}
+> @kbd{i++}
+> @kbd{print i}
+> @address@hidden'}
address@hidden 4
address@hidden example
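
One way to get the expected five iterations is to count with an integer
and derive the floating-point value from it, so that no rounding error
accumulates in the loop variable. This is a sketch of that idea; the name
@code{loop_count_demo} and the variable @code{k} are illustrative.

```shell
# Contrast a loop driven by an accumulated float with one driven by an integer.
# loop_count_demo is a hypothetical name used only for this illustration.
loop_count_demo() {
    awk 'BEGIN {
        # Accumulating 0.1 overshoots 1.5 slightly, so the last pass is skipped.
        for (d = 1.1; d <= 1.5; d += 0.1)
            i++
        print i                  # 4

        # The integer counter is exact; d is recomputed on every pass.
        for (k = 11; k <= 15; k++) {
            d = k / 10
            j++
        }
        print j                  # 5
    }'
}
loop_count_demo
```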
+
+Can computation using arbitrary precision help with the previous examples?
+If you are impatient to know, see
address@hidden Arithmetic}.
+
+Instead of arbitrary precision floating-point arithmetic,
+often all you need is an adjustment of your logic
+or a different order for the operations in your calculation.
+The stability and the accuracy of the computation of the constant @value{PI}
+in the previous example can be enhanced by using the following
+simple algebraic transformation:
+
address@hidden
+(sqrt(x * x + 1) - 1) / x = x / (sqrt(x * x + 1) + 1)
address@hidden example
address@hidden FIXME: Show new program and results
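
As a sketch of what the revised program might look like, the loop below
uses the form @code{x / (sqrt(x * x + 1) + 1)}, obtained by multiplying the
numerator and denominator of @code{(sqrt(x * x + 1) - 1) / x} by
@code{sqrt(x * x + 1) + 1}. This avoids subtracting two nearly equal
quantities. The function name @code{stable_pi} is illustrative, and this is
not necessarily the program the author had in mind.

```shell
# Compute successive approximations of pi with the rewritten recurrence.
# stable_pi is a hypothetical name; the loop otherwise mirrors pi.awk above.
stable_pi() {
    awk 'BEGIN {
        x = 1.0 / sqrt(3.0)
        n = 6
        for (i = 1; i < 30; i++) {
            n = n * 2.0
            x = x / (sqrt(x * x + 1) + 1)   # no cancellation in this form
            printf("%.15f\n", n * x)
        }
    }'
}
stable_pi | tail -n 1
```

With this form the loop runs to completion, and the printed values settle
near the value of pi instead of collapsing to zero.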
+
+There is no need to be unduly suspicious about the results from
+floating-point arithmetic. The lesson to remember is that
+floating-point arithmetic is always more complex than the arithmetic using
+pencil and paper. In order to take advantage of the power
+of computer floating-point, you need to know its limitations
+and work within them. For most casual use of floating-point arithmetic,
you will often get the expected result in the end if you simply round
the display of your final results to the correct number of significant
decimal digits. Avoid presenting numerical data in a manner that
+decimal digits. And, avoid presenting numerical data in a manner that
implies better precision than is actually the case.
+@menu
+* Floating-point Representation:: Binary floating-point representation.
+* Floating-point Context:: Floating-point context.
+* Rounding Mode:: Floating-point rounding mode.
+@end menu
+
@node Floatingpoint Representation
address@hidden Binary Floatingpoint Representation
address@hidden Binary Floatingpoint Representation
@cindex IEEE754 format
Although floatingpoint representations vary from machine to machine,
@@ 18654,13 +18986,13 @@ IEEE 754 Standard. An IEEE754 format value has three
components:
@itemize @bullet
@item
a sign bit telling whether the number is positive or negative,
+A sign bit telling whether the number is positive or negative.
@item
an @dfn{exponent} giving its order of magnitude, @var{e},
+An @dfn{exponent} giving its order of magnitude, @var{e}.
@item
and a @dfn{significand}, @var{s},
+A @dfn{significand}, @var{s},
specifying the actual digits of the number.
@end itemize
@@ 18681,24 +19013,27 @@ Three of the standard IEEE754 types are 32bit
single precision,
The standard also specifies extended precision formats
to allow greater precisions and larger exponent ranges.
+The significand is stored in @dfn{normalized} format,
+which means that the first bit is always a one.
+
@node Floatingpoint Context
address@hidden Floatingpoint Context
address@hidden Floatingpoint Context
@cindex context, floatingpoint
A floatingpoint context defines the environment for arithmetic operations.
It governs precision, sets rules for rounding and limits range for exponents.
+A floatingpoint @dfn{context} defines the environment for arithmetic
operations.
+It governs precision, sets rules for rounding, and limits the range for
exponents.
The context has the following primary components:
address@hidden @code
address@hidden precision
address@hidden @dfn
address@hidden Precision
Precision of the floatingpoint format in bits.
@item emax
Maximum exponent allowed for this format.
@item emin
Minimum exponent allowed for this format.
address@hidden underflow behavior
address@hidden Underflow behavior
The format may or may not support gradual underflow.
address@hidden rounding
address@hidden Rounding
The rounding mode of this context.
@end table
@@ 18706,7 +19041,7 @@ The rounding mode of this context.
field values for the basic IEEE754 binary formats:
@float Table,tableieeeformats
address@hidden IEEE Formats}
address@hidden IEEE Format Context Values}
@multitable @columnfractions .20 .20 .20 .20 .20
@headitem Name @tab Total bits @tab Precision @tab emin @tab emax
@item Single @tab 32 @tab 24 @tab @minus{}126 @tab +127
@@ 18740,31 +19075,29 @@ support subnormal numbers.
@end quotation
@node Rounding Mode
address@hidden Floatingpoint Rounding Mode
address@hidden Floatingpoint Rounding Mode
@cindex rounding mode, floatingpoint
The @dfn{rounding mode} specifies the behavior for the results of numerical
operations when discarding extra precision. Each rounding mode indicates
how the least significant returned digit of a rounded result is to
be calculated.
The @code{ROUNDMODE} variable (@pxref{Setting Rounding Mode}) provides
program level control over the rounding mode.
@ref{tableroundingmodes} lists the IEEE754 defined
rounding modes:
@float Table,tableroundingmodes
address@hidden Modes}
address@hidden @columnfractions .45 .30 .25
address@hidden Rounding Mode @tab IEEE Name @tab @code{ROUNDMODE}
address@hidden Round to nearest, ties to even @tab @code{roundTiesToEven} @tab
@code{"N"} or @code{"n"}
address@hidden Round toward plus Infinity @tab @code{roundTowardPositive} @tab
@code{"U"} or @code{"u"}
address@hidden Round toward negative Infinity @tab @code{roundTowardNegative}
@tab @code{"D"} or @code{"d"}
address@hidden Round toward zero @tab @code{roundTowardZero} @tab @code{"Z"} or
@code{"z"}
address@hidden Round to nearest, ties away from zero @tab
@code{roundTiesToAway} @tab @code{"A"} or @code{"a"}
address@hidden 754 Rounding Modes}
address@hidden @columnfractions .45 .55
address@hidden Rounding Mode @tab IEEE Name
address@hidden Round to nearest, ties to even @tab @code{roundTiesToEven}
address@hidden Round toward plus Infinity @tab @code{roundTowardPositive}
address@hidden Round toward negative Infinity @tab @code{roundTowardNegative}
address@hidden Round toward zero @tab @code{roundTowardZero}
address@hidden Round to nearest, ties away from zero @tab @code{roundTiesToAway}
@end multitable
@end float
The default mode @samp{roundTiesToEven} is the most preferred,
+The default mode @code{roundTiesToEven} is the most preferred,
but the least intuitive. This method does the obvious thing for most values,
by rounding them up or down to the nearest digit.
For example, rounding 1.132 to two digits yields 1.13,
@@ 18790,10 +19123,10 @@ BEGIN @{
@end example
@noindent
produces the following output when address@hidden
+produces the following output when run:@footnote{It
is possible for the output to be completely different if the
C library in your system does not use the IEEE754 evenrounding
rule to round halfway cases for @code{printf()}.}:
+rule to round halfway cases for @code{printf()}.}
@example
3.5 => 4
@@ 18807,26 +19140,26 @@ rule to round halfway cases for @code{printf()}.}:
4.5 => 4
@end example
The theory behind the rounding mode @samp{roundTiesToEven} is that
+The theory behind the rounding mode @code{roundTiesToEven} is that
it more or less evenly distributes upward and downward rounds
of exact halves, which might cause the roundoff error
to cancel itself out. This is the default rounding mode used
in IEEE754 computing functions and operators.
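
As a cross-check outside of awk, the ties-to-even rule can be sketched in Python, whose built-in `round()` also rounds exact halves to the nearest even integer. The helper name is hypothetical and the sketch assumes IEEE-754 doubles; every input below is exactly representable in binary, so each one is a true tie.

```python
def round_half_even_table(values):
    """Pair each value with Python's round(), which rounds ties to even."""
    return [(v, round(v)) for v in values]


if __name__ == "__main__":
    halves = [-4.5, -3.5, -2.5, -1.5, -0.5, 0.5, 1.5, 2.5, 3.5, 4.5]
    for value, rounded in round_half_even_table(halves):
        print("%4.1f => %d" % (value, rounded))
```

Half of the ties round up and half round down, which is the even distribution the paragraph above describes.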
The other rounding modes are rarely used.
Round toward positive infinity (@samp{roundTowardPositive})
and round toward negative infinity (@samp{roundTowardNegative})
+Round toward positive infinity (@code{roundTowardPositive})
+and round toward negative infinity (@code{roundTowardNegative})
are often used to implement interval arithmetic,
where you adjust the rounding mode to calculate upper and lower bounds
for the range of output. The @samp{roundTowardZero}
+for the range of output. The @code{roundTowardZero}
mode can be used for converting floatingpoint numbers to integers.
The rounding mode @samp{roundTiesToAway} rounds the result to the
+The rounding mode @code{roundTiesToAway} rounds the result to the
nearest number and selects the number with the larger magnitude
if a tie occurs.
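
The five modes can be compared side by side with a small Python sketch. It uses the standard `decimal` module, which works in base 10 rather than binary, so it only illustrates the rounding rules and does not model how gawk or MPFR behaves internally. The mapping from IEEE-754 names to `decimal` constants and the helper name `round_with` are assumptions made for this illustration.

```python
from decimal import (Decimal, ROUND_CEILING, ROUND_DOWN, ROUND_FLOOR,
                     ROUND_HALF_EVEN, ROUND_HALF_UP)

# Assumed correspondence between IEEE-754 mode names and decimal constants.
MODES = {
    "roundTiesToEven": ROUND_HALF_EVEN,
    "roundTowardPositive": ROUND_CEILING,
    "roundTowardNegative": ROUND_FLOOR,
    "roundTowardZero": ROUND_DOWN,
    "roundTiesToAway": ROUND_HALF_UP,
}


def round_with(mode_name, text, places="1"):
    """Round the decimal string `text` to the exponent of `places`."""
    return Decimal(text).quantize(Decimal(places), rounding=MODES[mode_name])


if __name__ == "__main__":
    for name in MODES:
        print(name, round_with(name, "2.5"), round_with(name, "-2.5"))
```

Running the loop for 2.5 and -2.5 shows how the directed modes bracket a value from above and below, which is the property interval arithmetic relies on.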
Some numerical analysts will tell you that your choice of rounding style
has tremendous impact on the final outcome, and advise you to wait until
final output for any rounding. Instead, you can often achieve this goal by
+final output for any rounding. Instead, you can often avoid roundoff error
problems by
setting the precision initially to some value sufficiently larger than
the final desired precision, so that the accumulation of roundoff error
does not influence the outcome.
@@ 18835,6 +19168,48 @@ sensitive to accumulation of roundoff error,
one way to be sure is to look for a significant difference in output
when you change the rounding mode.
address@hidden Gawk and MPFR
address@hidden @command{gawk} + MPFR = Powerful Arithmetic
+
+The rest of this @value{CHAPTER} describes how to use the arbitrary precision
+(also known as @dfn{multiple precision} or @dfn{infinite precision}) numeric
+capabilities in @command{gawk} to produce maximally accurate results
+when you need them.
+
+But first you should check if your version of
address@hidden supports arbitrary precision arithmetic.
+The easiest way to find out is to look at the output of
+the following command:
+
+@example
+$ @kbd{gawk --version}
+@print{} GNU Awk 4.1.0 (GNU MPFR 3.1.0, GNU MP 5.0.3)
+@print{} Copyright (C) 1989, 1991-2012 Free Software Foundation.
+@dots{}
+@end example
+
address@hidden uses the
address@hidden://www.mpfr.org, GNU MPFR}
+and
address@hidden://gmplib.org, GNU MP} (GMP)
+libraries for arbitrary precision
+arithmetic on numbers. So if you do not see the names of these libraries
+in the output, then your version of @command{gawk} does not support
+arbitrary precision arithmetic.
+
+Additionally,
+there are a few elements available in the @code{PROCINFO} array
+to provide information about the MPFR and GMP libraries.
address@hidden, for more information.
+
address@hidden
+Even if you aren't interested in arbitrary precision arithmetic, you
+may still benefit from knowing about how @command{gawk} handles numbers
+in general, and the limitations of doing arithmetic with ordinary
address@hidden numbers.
address@hidden ignore
+
+
@node Arbitrary Precision Floats
@section Arbitrary Precision Floatingpoint Arithmetic with @command{gawk}
@@ 18854,10 +19229,10 @@ provide control over the working precision and the
rounding mode.
The precision and the rounding mode are set globally for every operation
to follow.
The default working precision for arbitrary precision floats is 53,
+The default working precision for arbitrary precision floatingpoint values is
53,
and the default value for @code{ROUNDMODE} is @code{"N"},
which selects the IEEE754
address@hidden (@pxref{Rounding Mode}) rounding address@hidden
address@hidden (@pxref{Rounding Mode}) rounding address@hidden
default precision is 53, since according to the MPFR documentation,
the library should be able to exactly reproduce all computations with
doubleprecision machine floatingpoint numbers (@code{double} type
@@ 18885,13 +19260,21 @@ gradual underflow (subnormal numbers).
@quotation NOTE
MPFR numbers are variablesize entities, consuming only as much space as
needed to store the significant digits. Since the performance using MPFR
numbers pales in comparison to doing math using the underlying machine
+numbers pales in comparison to doing arithmetic using the underlying machine
types, you should consider using only as much precision as needed by
your program.
@end quotation
+@menu
+* Setting Precision:: Setting the working precision.
+* Setting Rounding Mode:: Setting the rounding mode.
+* Floating-point Constants:: Representing floating-point constants.
+* Changing Precision:: Changing the precision of a number.
+* Exact Arithmetic:: Exact arithmetic with floating-point numbers.
+@end menu
+
@node Setting Precision
address@hidden Setting the Working Precision
address@hidden Setting the Working Precision
@cindex @code{PREC} variable
@command{gawk} uses a global working precision; it does not keep track of
@@ 18956,21 +19339,36 @@ the numbers from your floatingpoint computations
with more than 15
significant digits in them.
Conversely, it takes a precision of 332 bits to hold an approximation
of constant @value{PI} that is accurate to 100 decimal places.
+of the constant @value{PI} that is accurate to 100 decimal places.
You should always add some extra bits in order to avoid the confusing roundoff
issues that occur because numbers are stored internally in binary.
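
The relationship between bits and decimal digits can be estimated with a short Python sketch. It assumes the usual rule of thumb of log2(10), about 3.322, bits per decimal digit; the function names are hypothetical and the results are rough estimates, not guarantees about round-trip conversion.

```python
import math

BITS_PER_DIGIT = math.log2(10)  # about 3.322 bits per decimal digit


def estimate_bits(decimal_digits):
    """Rough number of bits needed for the given count of decimal digits."""
    return math.floor(decimal_digits * BITS_PER_DIGIT)


def estimate_digits(precision_bits):
    """Rough number of decimal digits that fit in the given precision."""
    return math.floor(precision_bits / BITS_PER_DIGIT)


if __name__ == "__main__":
    print(estimate_bits(100))    # bits for 100 decimal places
    print(estimate_digits(53))   # digits in an IEEE-754 double
```

Because these are floor estimates, adding a few guard bits on top of `estimate_bits()` is consistent with the advice above.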
@node Setting Rounding Mode
address@hidden Setting the Rounding Mode
address@hidden Setting the Rounding Mode
@cindex @code{ROUNDMODE} variable
The builtin variable @code{ROUNDMODE} has the default value @code{"N"},
which selects the IEEE754 rounding mode @samp{roundTiesToEven}.
The other possible values for @code{ROUNDMODE} are @code{"U"} for rounding mode
address@hidden, @code{"D"} for @samp{roundTowardNegative},
and @code{"Z"} for @samp{roundTowardZero}.
+The @code{ROUNDMODE} variable provides
+program level control over the rounding mode.
+The correspondence between @code{ROUNDMODE} and the IEEE
+rounding modes is shown in @ref{tablegawkroundingmodes}.
+
address@hidden Table,tablegawkroundingmodes
address@hidden@command{gawk} Rounding Modes}
address@hidden @columnfractions .45 .30 .25
address@hidden Rounding Mode @tab IEEE Name @tab @code{ROUNDMODE}
address@hidden Round to nearest, ties to even @tab @code{roundTiesToEven} @tab
@code{"N"} or @code{"n"}
address@hidden Round toward plus Infinity @tab @code{roundTowardPositive} @tab
@code{"U"} or @code{"u"}
address@hidden Round toward negative Infinity @tab @code{roundTowardNegative}
@tab @code{"D"} or @code{"d"}
address@hidden Round toward zero @tab @code{roundTowardZero} @tab @code{"Z"} or
@code{"z"}
address@hidden Round to nearest, ties away from zero @tab
@code{roundTiesToAway} @tab @code{"A"} or @code{"a"}
address@hidden multitable
address@hidden float
+
address@hidden has the default value @code{"N"},
+which selects the IEEE754 rounding mode @code{roundTiesToEven}.
+Besides the values listed in @ref{tablegawkroundingmodes},
@command{gawk} also accepts @code{"A"} to select the IEEE754 mode
address@hidden
address@hidden
if your version of the MPFR library supports it; otherwise setting
@code{ROUNDMODE} to this value has no effect. @xref{Rounding Mode},
for the meanings of the various rounding modes.
@@ 18984,7 +19382,7 @@ $ @kbd{gawk M vROUNDMODE="Z" 'BEGIN @{
printf("%.2f\n", 1.378) @}'}
@end example
@node Floatingpoint Constants
address@hidden Representing Floatingpoint Constants
address@hidden Representing Floatingpoint Constants
@cindex constants, floatingpoint
Be wary of floatingpoint constants! When reading a floatingpoint constant
@@ 18997,7 +19395,7 @@ not change the precision of a constant. If you need to
represent a floatingpoint constant at a higher precision than the
default and cannot use a command line assignment to @code{PREC},
you should either specify the constant as a string, or
a rational number whenever possible. The following example
+as a rational number whenever possible. The following example
illustrates the differences among various ways to
print a floatingpoint constant:
@@ 19015,7 +19413,7 @@ $ @kbd{gawk M 'BEGIN @{ PREC = 113; printf("%0.25f\n",
1/10) @}'}
In the first case, the number is stored with the default precision of 53.
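
The same effect can be observed outside of gawk. The Python sketch below is an analogy only: it assumes IEEE-754 doubles and uses the standard `decimal` and `fractions` modules in place of MPFR. A constant written as a floating-point literal is rounded to 53 bits before anything else sees it, whereas a string or a rational keeps the intended value.

```python
from decimal import Decimal
from fractions import Fraction

# The literal 0.1 is first rounded to the nearest double; Decimal() then
# shows that double exactly, digit for digit.
from_literal = Decimal(0.1)

# A string is converted directly, so no binary rounding takes place.
from_string = Decimal("0.1")

# A rational number stores numerator and denominator exactly.
as_rational = Fraction(1, 10)

if __name__ == "__main__":
    print(from_literal)
    print(from_string)
    print(as_rational)
```

The first value printed carries the error of the 53-bit conversion; raising the precision afterward cannot remove it, which is why the text recommends strings or rationals for such constants.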
@node Changing Precision
address@hidden Changing the Precision of a Number
address@hidden Changing the Precision of a Number
@cindex Laurie, Dirk
@quotation
@@ 19029,7 +19427,7 @@ Sometimes the first course is proper, sometimes the
second, and it takes
careful analysis to tell which.}
Dirk address@hidden Laurie.
address@hidden Arithmetic Considered Perilous  A Detective Story}.
address@hidden Arithmetic Considered Perilous  A Detective Story}.
Electronic Transactions on Numerical Analysis. Volume 28, pp. 168173, 2008.}
@end quotation
@@ 19054,7 +19452,7 @@ x += 0.0
@end example
@node Exact Arithmetic
address@hidden Exact Arithmetic with Floatingpoint Numbers
address@hidden Exact Arithmetic with Floatingpoint Numbers
@quotation CAUTION
Never depend on the exactness of floatingpoint arithmetic,
@@ 19109,80 +19507,28 @@ In applications where 15 or fewer decimal places
suffice,
hardware double precision arithmetic can be adequate, and is usually much
faster.
But you do need to keep in mind that every floatingpoint operation
can suffer a new rounding error with catastrophic consequences as illustrated
by our attempt to compute the value of the constant @value{PI},
+by our attempt to compute the value of the constant @value{PI}
(@pxref{Floatingpoint Programming}).
Extra precision can greatly enhance the stability and the accuracy
of your computation in such cases.
Repeated addition is not necessarily equivalent to multiplication
in floatingpoint arithmetic. In the last example
(@pxref{Floatingpoint Programming}),
you may or may not succeed in getting the correct result by choosing
an arbitrarily large value for @code{PREC}. Reformulation of
the problem at hand is often the correct approach in such situations.


address@hidden Integer Programming
address@hidden Effective Integer Programming

As has been mentioned already, @command{gawk} ordinarily uses hardware double
precision with 64bit IEEE binary floatingpoint representation
for numbers on most systems. A large integer like 9007199254740997
has a binary representation that, although finite, is more than 53 bits long;
it must also be rounded to 53 bits.
The biggest integer that can be stored in a C @code{double} is usually the same
as the largest possible value of a @code{double}. If your system @code{double}
is an IEEE 64bit @code{double}, this largest possible value is an integer and
can be represented precisely. What more should one know about integers?

If you want to know what is the largest integer, such that it and
all smaller integers can be stored in 64bit doubles without losing precision,
then the answer is
address@hidden
address@hidden
address@hidden iftex
address@hidden
2^53.
address@hidden ifnottex
The next representable number is the even number
address@hidden
address@hidden + 2},
address@hidden iftex
address@hidden
2^53 + 2,
address@hidden ifnottex
meaning it is unlikely that you will be able to make
address@hidden print
address@hidden
address@hidden + 1}
address@hidden iftex
address@hidden
2^53 + 1
address@hidden ifnottex
in integer format.
The range of integers exactly representable by a 64bit double
is
address@hidden
address@hidden, 2^{53}]}.
address@hidden iftex
address@hidden
address@hidden, 2^53].
address@hidden ifnottex
If you ever see an integer outside this range in @command{gawk}
using 64bit doubles, you have reason to be very suspicious about
the accuracy of the output. Here is a simple program with erroneous output:
+in floating-point arithmetic. In the example in
+@ref{Floating-point Programming}:
@example
$ @kbd{gawk 'BEGIN @{ i = 2^53  1; for (j = 0; j < 4; j++) print i + j @}'}
address@hidden 9007199254740991
address@hidden 9007199254740992
address@hidden 9007199254740992
address@hidden 9007199254740994
+$ @kbd{gawk 'BEGIN @{}
+> @kbd{for (d = 1.1; d <= 1.5; d += 0.1)}
+> @kbd{i++}
+> @kbd{print i}
+> @kbd{@}'}
+@print{} 4
@end example
The lesson is to not assume that any large integer printed by @command{gawk}
represents an exact result from your computation, especially if it wraps
around on your screen.
address@hidden
+you may or may not succeed in getting the correct result by choosing
+an arbitrarily large value for @code{PREC}. Reformulation of
+the problem at hand is often the correct approach in such situations.
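
One possible reformulation, sketched in Python under the assumption of IEEE-754 doubles, is to drive the loop with an integer counter and derive the floating-point value from it, so that no error accumulates from repeated addition. The function names are hypothetical.

```python
def count_by_repeated_addition():
    """Mirror the awk loop: add 0.1 repeatedly and count the iterations."""
    count = 0
    d = 1.1
    while d <= 1.5:
        count += 1
        d += 0.1  # each addition can introduce a small rounding error
    return count


def count_by_integer_steps():
    """Count in exact integer tenths and convert only when a value is needed."""
    count = 0
    for tenths in range(11, 16):  # 11..15 stands for 1.1..1.5
        d = tenths / 10.0         # at most one rounding per value, none accumulated
        if d <= 1.5:
            count += 1
    return count


if __name__ == "__main__":
    print(count_by_repeated_addition(), count_by_integer_steps())
```

The first function reproduces the count of 4 shown above, while the integer-driven version visits all five values.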
@node Arbitrary Precision Integers
@section Arbitrary Precision Integer Arithmetic with @command{gawk}
@@ 19227,12 +19573,14 @@ would be @math{3.322 @cdot 183231},
would be 3.322 x 183231,
@end ifnottex
or 608693.
+(Thus, the floatingpoint representation requires over 30 times as
+many decimal digits!)
The result from an arithmetic operation with an integer and a floatingpoint
value
is a floatingpoint value with a precision equal to the working precision.
The following program calculates the eighth term in
Sylvester's address@hidden, Eric W.
address@hidden's Sequence}. From MathWorldA Wolfram Web Resource.
address@hidden's Sequence}. From MathWorldA Wolfram Web Resource.
@url{http://mathworld.wolfram.com/SylvestersSequence.html}}
using a recurrence:
@@ 19250,7 +19598,7 @@ The output differs from the acutal number,
113423713055421844361000443,
because the default precision of 53 is not enough to represent the
floatingpoint results exactly. You can either increase the precision
(100 is enough in this case), or replace the floatingpoint constant
address@hidden with an integer, to perform all computations using integer
address@hidden with an integer, to perform all computations using integer
arithmetic to get the correct output.
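
The difference between the two approaches can be sketched in Python, whose built-in integers have arbitrary precision and whose floats are assumed here to be IEEE-754 doubles. The function names are hypothetical; the recurrence is the one described above, s = s * (s - 1) + 1 starting from 2.

```python
def sylvester_exact(term):
    """Return the given term (1-based) using exact integer arithmetic."""
    s = 2
    for _ in range(term - 1):
        s = s * (s - 1) + 1
    return s


def sylvester_float(term):
    """Same recurrence, but in 53-bit double precision floating point."""
    s = 2.0
    for _ in range(term - 1):
        s = s * (s - 1) + 1
    return s


if __name__ == "__main__":
    print(sylvester_exact(8))
    print("%d" % sylvester_float(8))
```

The eighth term is an odd number far larger than 2^53, so no double can hold it exactly; only the integer version returns 113423713055421844361000443.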
It will sometimes be necessary for @command{gawk} to implicitly convert an
@@ 19267,28 +19615,20 @@ like this:
gawk -M 'BEGIN @{ n = 13; print (n + 0.0) % 2.0 @}'
@end example
You can avoid this issue altogether by specifying the number as a float
+You can avoid this issue altogether by specifying the number as a
floatingpoint value
to begin with:
@example
gawk -M 'BEGIN @{ n = 13.0; print n % 2.0 @}'
@end example
Note that for the particular example above, there is unlikely to be a
reason for simply not using the following:
+Note that for the particular example above, it is likely best
+to just use the following:
@example
gawk -M 'BEGIN @{ n = 13; print n % 2 @}'
@end example

address@hidden MPFR and GMP Libraries
address@hidden Information About the MPFR and GMP Libraries

There are a few elements available in the @code{PROCINFO} array
to provide information about the MPFR and GMP libraries.
address@hidden, for more information.

@node Advanced Features
@chapter Advanced Features of @command{gawk}
@cindex advanced features, network connections, See Also networks, connections
@@ 30191,7 +30531,7 @@ When @option{sandbox} is specified, extensions are
disabled
@menu
* Internals:: A brief look at some @command{gawk} internals.
* Plugin License:: A note about licensing.
* Loading Extensions:: How to load dynamic extensions.
+* Loading Extensions:: How to load dynamic extensions.
* Sample Library:: A example of new functions.
@end menu
@@ 31115,7 +31455,6 @@ other introductory texts that you should refer to
instead.)
@menu
* Basic High Level:: The high level view.
* Basic Data Typing:: A very quick intro to data types.
* Floating Point Issues:: Stuff to know about floatingpoint numbers.
@end menu
@node Basic High Level
@@ 31244,69 +31583,32 @@ and easier to read.
@appendixsec Data Values in a Computer
@cindex variables
In a program,
you keep track of information and values in things called @dfn{variables}.
A variable is just a name for a given value, such as @code{first_name},
address@hidden, @code{address}, and so on.
address@hidden has several predefined variables, and it has
special names to refer to the current input record
and the fields of the record.
You may also group multiple
associated values under one name, as an array.

address@hidden values, numeric
address@hidden values, string
address@hidden scalar values
Data, particularly in @command{awk}, consists of either numeric
values, such as 42 or 3.1415927, or string values.
String values are essentially anything that's not a number, such as a name.
Strings are sometimes referred to as @dfn{character data}, since they
store the individual characters that comprise them.
Individual variables, as well as numeric and string variables, are
referred to as @dfn{scalar} values.
Groups of values, such as arrays, are not scalars.

address@hidden integers
address@hidden floatingpoint, numbers
address@hidden numbers, floatingpoint
Within computers, there are two kinds of numeric values: @dfn{integers}
and @dfn{floatingpoint}.
In school, integer values were referred to as ``whole'' numbersthat is,
numbers without any fractional part, such as 1, 42, or @minus{}17.
The advantage to integer numbers is that they represent values exactly.
The disadvantage is that their range is limited. On most systems,
this range is @minus{}2,147,483,648 to 2,147,483,647.
However, many systems now support a range from
address@hidden,223,372,036,854,775,808 to 9,223,372,036,854,775,807.

address@hidden unsigned integers
address@hidden integers, unsigned
Integer values come in two flavors: @dfn{signed} and @dfn{unsigned}.
Signed values may be negative or positive, with the range of values just
described.
Unsigned values are always positive. On most systems,
the range is from 0 to 4,294,967,295.
However, many systems now support a range from
0 to 18,446,744,073,709,551,615.

address@hidden double precision floatingpoint
address@hidden single precision floatingpoint
Floatingpoint numbers represent what are called ``real'' numbers; i.e.,
those that do have a fractional part, such as 3.1415927.
The advantage to floatingpoint numbers is that they
can represent a much larger range of values.
The disadvantage is that there are numbers that they cannot represent
exactly.
address@hidden uses @dfn{double precision} floatingpoint numbers, which
can hold more digits than @dfn{single precision}
floatingpoint numbers.
Floatingpoint issues are discussed more fully in
address@hidden Point Issues}.
+In a program,
+you keep track of information and values in things called @dfn{variables}.
+A variable is just a name for a given value, such as @code{first_name},
address@hidden, @code{address}, and so on.
address@hidden has several predefined variables, and it has
+special names to refer to the current input record
+and the fields of the record.
+You may also group multiple
+associated values under one name, as an array.
At the very lowest level, computers store values as groups of binary digits,
or @dfn{bits}. Modern computers group bits into groups of eight, called
@dfn{bytes}.
Advanced applications sometimes have to manipulate bits directly,
and @command{gawk} provides functions for doing so.
address@hidden values, numeric
address@hidden values, string
address@hidden scalar values
+Data, particularly in @command{awk}, consists of either numeric
+values, such as 42 or 3.1415927, or string values.
+String values are essentially anything that's not a number, such as a name.
+Strings are sometimes referred to as @dfn{character data}, since they
+store the individual characters that comprise them.
+Individual variables, as well as numeric and string variables, are
+referred to as @dfn{scalar} values.
+Groups of values, such as arrays, are not scalars.
+
address@hidden Arithmetic}, provided a basic introduction to numeric
+types (integer and floatingpoint) and how they are used in a computer.
+Please review that information, including a number of caveats that
+were presented.
@cindex null strings
While you are probably used to the idea of a number without a value (i.e.,
zero),
@@ 31330,6 +31632,11 @@ plus 0 times 1, or decimal 10.
Octal and hexadecimal are discussed more in
@ref{Nondecimalnumbers}.
+At the very lowest level, computers store values as groups of binary digits,
+or @dfn{bits}. Modern computers group bits into groups of eight, called
@dfn{bytes}.
+Advanced applications sometimes have to manipulate bits directly,
+and @command{gawk} provides functions for doing so.
+
Programs are written in programming languages.
Hundreds, if not thousands, of programming languages exist.
One of the most popular is the C programming language.
@@ 31349,239 +31656,6 @@ standard for C. This standard became an ISO standard
in 1990.
In 1999, a revised ISO C standard was approved and released.
Where it makes sense, POSIX @command{awk} is compatible with 1999 ISO C.
address@hidden Floating Point Issues
address@hidden FloatingPoint Number Caveats

As mentioned earlier, floatingpoint numbers represent what are called
``real'' numbers, i.e., those that have a fractional part. @command{awk}
uses double precision floatingpoint numbers to represent all
numeric values. This @value{SECTION} describes some of the issues
involved in using floatingpoint numbers.

There is a very nice
address@hidden://www.validlab.com/goldberg/paper.pdf, paper on floatingpoint
arithmetic}
by David Goldberg,
``What Every Computer Scientist Should Know About Floatingpoint Arithmetic,''
address@hidden Computing Surveys} @strong{23}, 1 (199103), 548.
This is worth reading if you are interested in the details,
but it does require a background in computer science.

address@hidden
* String Conversion Precision:: The String Value Can Lie.
* Unexpected Results:: Floating Point Numbers Are Not Abstract
 Numbers.
* POSIX Floating Point Problems:: Standards Versus Existing Practice.
address@hidden menu

address@hidden String Conversion Precision
address@hidden The String Value Can Lie

Internally, @command{awk} keeps both the numeric value
(double precision floatingpoint) and the string value for a variable.
Separately, @command{awk} keeps
track of what type the variable has
(@pxref{Typing and Comparison}),
which plays a role in how variables are used in comparisons.

It is important to note that the string value for a number may not
reflect the full value (all the digits) that the numeric value
actually contains.
The following program (@file{values.awk}) illustrates this:

address@hidden
address@hidden
 sum = $1 + $2
 # see it for what it is
 printf("sum = %.12g\n", sum)
 # use CONVFMT
 a = "<" sum ">"
 print "a =", a
 # use OFMT
 print "sum =", sum
address@hidden
address@hidden example

address@hidden
This program shows the full value of the sum of @code{$1} and @code{$2}
using @code{printf}, and then prints the string values obtained
from both automatic conversion (via @code{CONVFMT}) and
from printing (via @code{OFMT}).

Here is what happens when the program is run:

address@hidden
$ @kbd{echo 3.654321 1.2345678  awk f values.awk}
address@hidden sum = 4.8888888
address@hidden a = <4.88889>
address@hidden sum = 4.88889
address@hidden example

This makes it clear that the full numeric value is different from
what the default string representations show.

address@hidden's default value is @code{"%.6g"}, which yields a value with
at least six significant digits. For some applications, you might want to
change it to specify more precision.
On most modern machines, most of the time,
17 digits is enough to capture a floatingpoint number's
value address@hidden cases can require up to
752 digits (!), but we doubt that you need to worry about this.}

address@hidden Unexpected Results
address@hidden Floating Point Numbers Are Not Abstract Numbers

address@hidden floatingpoint, numbers
Unlike numbers in the abstract sense (such as what you studied in high school
or college math), numbers stored in computers are limited in certain ways.
They cannot represent an infinite number of digits, nor can they always
represent things exactly.
In particular,
floatingpoint numbers cannot
always represent values exactly. Here is an example:

address@hidden
$ @kbd{awk '@{ printf("%010d\n", $1 * 100) @}'}
515.79
address@hidden 0000051579
515.80
address@hidden 0000051579
515.81
address@hidden 0000051580
515.82
address@hidden 0000051582
address@hidden@value{CTL}d}
address@hidden example

address@hidden
This shows that some values can be represented exactly,
whereas others are only approximated. This is not a ``bug''
in @command{awk}, but simply an artifact of how computers
represent numbers.

address@hidden negative zero
address@hidden positive zero
address@hidden address@hidden negative vs.@: positive
Another peculiarity of floatingpoint numbers on modern systems
is that they often have more than one representation for the number zero!
In particular, it is possible to represent ``minus zero'' as well as
regular, or ``positive'' zero.

This example shows that negative and positive zero are distinct values
when stored internally, but that they are in fact equal to each other,
as well as to ``regular'' zero:

address@hidden
$ @kbd{gawk 'BEGIN @{ mz = 0 ; pz = 0}
> @kbd{printf "0 = %g, +0 = %g, (0 == +0) > %d\n", mz, pz, mz == pz}
> @kbd{printf "mz == 0 > %d, pz == 0 > %d\n", mz == 0, pz == 0}
> @address@hidden'}
address@hidden 0 = 0, +0 = 0, (0 == +0) > 1
address@hidden mz == 0 > 1, pz == 0 > 1
address@hidden example

It helps to keep this in mind should you process numeric data
that contains negative zero values; the fact that the zero is negative
is noted and can affect comparisons.

address@hidden POSIX Floating Point Problems
address@hidden Standards Versus Existing Practice

Historically, @command{awk} has converted any nonnumeric looking string
to the numeric value zero, when required. Furthermore, the original
definition of the language and the original POSIX standards specified that
address@hidden only understands decimal numbers (base 10), and not octal
(base 8) or hexadecimal numbers (base 16).

Changes in the language of the
2001 and 2004 POSIX standard can be interpreted to imply that @command{awk}
should support additional features. These features are:

@itemize @bullet
@item
Interpretation of floating point data values specified in hexadecimal
notation (@samp{0xDEADBEEF}). (Note: data values, @emph{not}
source code constants.)

@item
Support for the special IEEE 754 floating point values ``Not A Number''
(NaN), positive Infinity (``inf'') and negative Infinity (``@minus{}inf'').
In particular, the format for these values is as specified by the ISO 1999
C standard, which ignores case and can allow machine-dependent additional
characters after the @samp{nan} and allows either @samp{inf} or @samp{infinity}.
@end itemize

The first problem is that both of these are clear changes to historical
practice:

@itemize @bullet
@item
The @command{gawk} maintainer feels that supporting hexadecimal floating
point values, in particular, is ugly, and was never intended by the
original designers to be part of the language.

@item
Allowing completely alphabetic strings to have valid numeric
values is also a very severe departure from historical practice.
@end itemize

The second problem is that the @code{gawk} maintainer feels that this
interpretation of the standard, which requires a certain amount of
``language lawyering'' to arrive at in the first place, was not even
intended by the standard developers. In other words, ``we see how you
got where you are, but we don't think that that's where you want to be.''

The 2008 POSIX standard added explicit wording to allow, but not require,
that @command{awk} support hexadecimal floating point values and
special values for ``Not A Number'' and infinity.

Although the @command{gawk} maintainer continues to feel that
providing those features is inadvisable,
nevertheless, on systems that support IEEE floating point, it seems
reasonable to provide @emph{some} way to support NaN and Infinity values.
The solution implemented in @command{gawk} is as follows:

@itemize @bullet
@item
With the @option{--posix} command-line option, @command{gawk} becomes
``hands off.'' String values are passed directly to the system library's
@code{strtod()} function, and if it successfully returns a numeric value,
that is what's used.@footnote{You asked for it, you got it.}
By definition, the results are not portable across
different systems. They are also a little surprising:

@example
$ @kbd{echo nanny | gawk --posix '@{ print $1 + 0 @}'}
@print{} nan
$ @kbd{echo 0xDeadBeef | gawk --posix '@{ print $1 + 0 @}'}
@print{} 3735928559
@end example

@item
Without @option{--posix}, @command{gawk} interprets the four strings
@samp{+inf},
@samp{-inf},
@samp{+nan},
and
@samp{-nan}
specially, producing the corresponding special numeric values.
The leading sign acts as a signal to @command{gawk} (and the user)
that the value is really numeric. Hexadecimal floating point is
not supported (unless you also use @option{--non-decimal-data},
which is @emph{not} recommended). For example:

@example
$ @kbd{echo nanny | gawk '@{ print $1 + 0 @}'}
@print{} 0
$ @kbd{echo +nan | gawk '@{ print $1 + 0 @}'}
@print{} nan
$ @kbd{echo 0xDeadBeef | gawk '@{ print $1 + 0 @}'}
@print{} 0
@end example

@command{gawk} does ignore case in the four special values.
Thus @samp{+nan} and @samp{+NaN} are the same.
@end itemize
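
The ``hands off'' results in POSIX mode follow from how the C library's strtod() parses text. The sketch below only illustrates that library behavior; it is not gawk's actual code, and to_number() is a hypothetical helper introduced here. It returns the converted value and reports how many leading characters strtod() consumed. On a C99-conforming library, the leading three characters of "nanny" and all of "0xDeadBeef" are accepted as numeric input.

```c
#include <stddef.h>  /* size_t, NULL */
#include <stdlib.h>  /* strtod() */

/* Hypothetical helper: converts the longest numeric prefix of s using
 * strtod() and, if consumed is not NULL, stores the number of characters
 * that strtod() accepted. A count of 0 means no conversion took place. */
double to_number(const char *s, size_t *consumed)
{
    char *end;
    double value = strtod(s, &end);

    if (consumed != NULL)
        *consumed = (size_t)(end - s);
    return value;
}
```

A caller that wants the stricter, sign-required rule described in the second item would have to check the input itself before calling strtod(); this sketch deliberately does not attempt that.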

@c ENDOFRANGE procon
@node Glossary

Summary of changes:
 doc/ChangeLog |    6 +
 doc/gawk.info | 2081 +++++++++++++++++++++++++++++
 doc/gawk.texi | 1268 +++++++++++++++++++
 3 files changed, 1753 insertions(+), 1602 deletions(-)

hooks/post-receive

gawk