IEEE floating point support for guile?

From: Chris Hanson
Subject: IEEE floating point support for guile?
Date: Wed, 08 Nov 2000 14:45:20 -0500
User-agent: IMAIL/1.6; Edwin/3.103; MIT-Scheme/7.5.10

   Date: 08 Nov 2000 11:28:30 -0500
   From: Jim Blandy <address@hidden>

   How do people actually writing heavily numeric code feel about this
   extra precision?  If Guile insists on putting the FPU in double
   precision (64 bit) mode, will that annoy people who try to use Guile
   in numeric applications?

Let me inject an opinion from two people here at MIT who do a lot of
numerical computation.  (I've CC'd them on this message so they can
complain if I misrepresent their position.)

Jack Wisdom and Gerry Sussman are currently teaching a class on
computational classical mechanics, which involves the use of Scheme
code to solve mechanics problems.  They have also done extensive
simulations of the solar system (not in Scheme) to understand its
long-term dynamics.

Both Jack and Gerry are much more interested in having repeatable
results and well-defined error bounds than in having extra precision.
The primary issue in numeric computation is _not_ precision (at least
if you use 64-bit IEEE), but error control.  And error control depends
critically upon deep understanding of the precise behavior of the
floating-point computations.  The IEEE standard was important
precisely because it carefully defined the entire floating-point
instruction set, which made it possible for numerical analysts to
design to this one specification and have their code run properly on
any conforming implementation.  Taking the same code and running it on
a non-conforming implementation, even one with extra precision, can
actually produce results that are LESS ACCURATE, because the
operations will behave differently than expected and therefore produce
different kinds of errors.  The errors from different parts of the
computation are supposed to fall into known ranges, and then be
combined in known ways, to produce an aggregate error.
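The hazard described above, that extra precision can make results *less*
accurate, can be sketched numerically.  This is a hypothetical
illustration (not from the original message) of "double rounding":
rounding a value first to an extended format with a 64-bit significand,
as an x87-style FPU does, and then to 53-bit double can give a different
answer than rounding directly to double, so code analyzed for strict
64-bit IEEE arithmetic behaves differently than its error analysis
assumes.

```python
from fractions import Fraction

def round_to_nearest(x, p):
    """Round a positive rational x to a p-bit significand,
    round-to-nearest, ties to even (the IEEE 754 default)."""
    e = 0
    while Fraction(2) ** (e + 1) <= x:
        e += 1
    while x < Fraction(2) ** e:
        e -= 1
    ulp = Fraction(2) ** (e - (p - 1))   # spacing of representable values near x
    q, r = divmod(x, ulp)
    if r > ulp / 2 or (r == ulp / 2 and q % 2 == 1):
        q += 1                           # round up, or break a tie to even
    return q * ulp

# A value chosen to straddle a rounding boundary of double precision:
v = 1 + Fraction(1, 2**53) + Fraction(1, 2**65)

direct = round_to_nearest(v, 53)                              # straight to double
via_extended = round_to_nearest(round_to_nearest(v, 64), 53)  # extended, then double

print(direct == via_extended)   # False: the extra precision changed the result
```

Here `direct` rounds up to 1 + 2^-52, but the intermediate 64-bit
rounding discards the 2^-65 term, leaving an exact tie that then rounds
down to 1, a one-ulp discrepancy of exactly the kind a careful error
analysis cannot anticipate.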

This is the essence of the argument in favor of using a standard
floating-point implementation.  It's also the reason why the rounding
mode should always be set to "round-to-nearest": because that is the
standard way to do it, which provides the correct known error bounds.
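The effect of the rounding mode can be illustrated with Python's
decimal module.  This is a hypothetical sketch; the original discussion
concerns binary FPU modes, but the principle is the same: the identical
division produces different results under round-to-nearest-even and
round-toward-positive-infinity, so an analysis built on one mode is
invalid under the other.

```python
from decimal import Decimal, localcontext, ROUND_HALF_EVEN, ROUND_CEILING

def one_third(rounding):
    # Evaluate 1/3 to 4 significant digits under the given rounding mode.
    with localcontext() as ctx:
        ctx.prec = 4
        ctx.rounding = rounding
        return Decimal(1) / Decimal(3)

nearest = one_third(ROUND_HALF_EVEN)   # round-to-nearest, ties to even
upward = one_third(ROUND_CEILING)      # round toward +infinity

print(nearest, upward)   # 0.3333 0.3334
```

Changing the mode shifts every rounding error in one direction, which
biases the aggregate error instead of letting errors partially cancel
as the round-to-nearest analysis expects.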

It's a real pity that almost no one understands this, _including_ many
of the designers of floating-point hardware.
