bug-coreutils

bug#6020: coreutils-8.x: a simple feature enhancement, and how to do it


From: Nelson H. F. Beebe
Subject: bug#6020: coreutils-8.x: a simple feature enhancement, and how to do it
Date: Fri, 23 Apr 2010 19:30:17 -0600 (MDT)

In 1981, 29 years ago, Intel introduced the 8087 floating-point
coprocessor that implemented an early draft of the 1985 IEEE 754
standard for binary floating-point arithmetic.  That chip and all
subsequent Intel IA-32, IA-64, EM64T, and AMD AMD64 (aka x86_64)
architectures provide three floating-point formats in hardware:

        32-bit  24-bit significand, number range ~= 1.40e-45 .. 3.40e+38,
                roughly 7 decimal digits
                C type float

        64-bit  53-bit significand, number range ~= 4.94e-324 .. 1.80e+308,
                roughly 16 decimal digits
                C type double

        80-bit  (variously stored in 10, 12, or 16-byte memory blocks)
                64-bit significand, number range ~= 3.64e-4951 .. 1.19e+4932
                roughly 19 decimal digits
                C type long double

Several other CPU platforms provide a 128-bit format instead of the
80-bit format, with these properties:

        128-bit 113-bit significand, number range ~= 6.48e-4966 .. 1.19e+4932,
                roughly 34 decimal digits
                C type long double

In 2009, the IEEE 754 Standard was revised to include the above, plus
decimal arithmetic, the latter with these properties:

        32-bit  7 digits, number range 1e-101 .. 9.999_999e+96

        64-bit  16 digits, number range 1e-398 .. 9.999_999_999_999_999e+384

        128-bit 34 digits, number range 1e-6176 ..
                9.999_999_999_999_999_999_999_999_999_999_999e+6144

At present, up to version 8.5, coreutils uses only type double in its
implementation of the -g sort-ordering option.  The result is that it
is unable to correctly sort files that use the entire number range of
IEEE 754 binary arithmetic; indeed, the double format covers only
about 6% of the possible binary range, and 5% of the decimal range.
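
The effect on ordering can be sketched with a small comparison helper.
This is only an illustration, not the sort.c code: the function names
are made up for the example, and the long double result depends on the
platform actually providing a wider exponent range than double.

```c
#include <float.h>
#include <stdlib.h>

/* Three-way comparison of two numeric strings after conversion to
   double.  Values outside the double range collapse to +/-infinity
   or to zero, so distinct inputs can compare equal.  */
int
compare_as_double (const char *sa, const char *sb)
{
  double a = strtod (sa, NULL);
  double b = strtod (sb, NULL);
  return (a > b) - (a < b);
}

/* The same comparison after conversion to long double.  */
int
compare_as_long_double (const char *sa, const char *sb)
{
  long double a = strtold (sa, NULL);
  long double b = strtold (sb, NULL);
  return (a > b) - (a < b);
}

/* Nonzero when long double has a wider decimal exponent range than
   double on the platform where this is compiled.  */
int
long_double_extends_range (void)
{
  return LDBL_MAX_10_EXP > DBL_MAX_10_EXP;
}
```

With double, "1e400" and "1e4000" both overflow and compare equal, so
their relative order in the output is not determined by their values.
Where long_double_extends_range() is nonzero, the long double
comparison orders them correctly.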

Please extend the next version of coreutils to use "long double"
instead of "double" in this operation.  Here is a patch that worked
for one recent coreutils release:

*** src/sort.c.~1~      Sun Jan  3 10:06:20 2010
--- src/sort.c  Mon Jan 18 08:24:18 2010
***************
*** 1792,1799 ****
--- 1792,1805 ----

    char *ea;
    char *eb;
+
+ #if 0
    double a = strtod (sa, &ea);
    double b = strtod (sb, &eb);
+ #else
+   long double a = strtold (sa, &ea);
+   long double b = strtold (sb, &eb);
+ #endif

    /* Put conversion errors at the start of the collating sequence.  */
    if (sa == ea)

The "long double" type is required by both C89 and C99, but the
strtold() function appeared first in C99 (although many vendors
supplied it before then).  If strtold() is absent, then

        long double x; if (sscanf(s, "%Lg", &x) == 1) {...}

is often a reasonable replacement.
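
One possible shape for such a fallback is sketched here.  The wrapper
name parse_long_double is hypothetical, and HAVE_STRTOLD is assumed to
be a macro that the build system defines when the C99 function is
available.  Because sort.c needs the end pointer to detect conversion
errors, the sketch uses the %n directive to recover it from sscanf().

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical wrapper: convert the leading number in S to long double
   and store in *END the address of the first unconverted character,
   mirroring the strtold() interface.  HAVE_STRTOLD is an assumed
   build-system macro, not something defined by this code.  */
long double
parse_long_double (const char *s, char **end)
{
#ifdef HAVE_STRTOLD
  return strtold (s, end);
#else
  long double x = 0.0L;
  int consumed = 0;

  /* %n records how many characters sscanf() has read so far; it does
     not count toward the return value.  */
  if (sscanf (s, "%Lg%n", &x, &consumed) != 1)
    {
      x = 0.0L;
      consumed = 0;     /* No conversion: leave *END == S.  */
    }
  if (end)
    *end = (char *) s + consumed;
  return x;
#endif
}
```

The two branches are not guaranteed to agree on malformed or
out-of-range input, so this should be read as an approximation of
strtold() rather than a drop-in equivalent.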

However, note that some aberrant systems implement "long double" as
"double" (e.g., DEC Alpha OSF/1 4.x, Minix, and most *BSD
distributions), and some implement it in doubled-double format, which
increases the precision, but leaves the range at that of double.
Examples of the latter include Apple Mac OS X on PowerPC, IBM AIX on
PowerPC, and SGI IRIX on MIPS.
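
These cases can be told apart at compile time from the <float.h>
limits.  The following is a minimal sketch; the enum and function names
are invented for the illustration and are not part of coreutils.

```c
#include <float.h>

enum long_double_kind
{
  LD_SAME_AS_DOUBLE,    /* no gain in range or precision */
  LD_PRECISION_ONLY,    /* e.g. doubled-double: more digits, same range */
  LD_EXTENDED_RANGE     /* e.g. 80-bit or 128-bit: wider exponent range */
};

/* Classify the platform's long double by comparing its binary exponent
   limit and significand width with those of double.  */
enum long_double_kind
classify_long_double (void)
{
  if (LDBL_MAX_EXP > DBL_MAX_EXP)
    return LD_EXTENDED_RANGE;
  if (LDBL_MANT_DIG > DBL_MANT_DIG)
    return LD_PRECISION_ONLY;
  return LD_SAME_AS_DOUBLE;
}
```

Only the LD_EXTENDED_RANGE case widens the set of numbers that can be
ordered; in the other two cases a switch to long double would leave the
sortable range unchanged.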

I suggest a configure-time check for strtold(), and if that works,
then use "long double" in sort.c.

-------------------------------------------------------------------------------
- Nelson H. F. Beebe                    Tel: +1 801 581 5254                  -
- University of Utah                    FAX: +1 801 581 4148                  -
- Department of Mathematics, 110 LCB    Internet e-mail: address@hidden  -
- 155 S 1400 E RM 233                       address@hidden  address@hidden -
- Salt Lake City, UT 84112-0090, USA    URL: http://www.math.utah.edu/~beebe/ -
-------------------------------------------------------------------------------






