guile-devel
[Top][All Lists]

## Immediate doubles (up to 2^256) and rationals coming to Guile 3

 From: Mark H Weaver Subject: Immediate doubles (up to 2^256) and rationals coming to Guile 3 Date: Thu, 06 Jun 2019 05:40:39 -0400

```I've found a way to efficiently support both immediate IEEE binary-64
doubles up to ~1.158e77 (with larger ones transparently allocated on the
heap), and also immediate exact rationals with up to 54 binary digits
(~16 decimal digits), without restricting the 64-bit pointer space at
all, and without any loss of arithmetic precision.

I have a working draft implementation that roughly doubles the speed of
a simple "substract 1.0 until negative" loop for inexact reals less than
2^256, compared with current 'master' (near 2.9.2).  The same loop for
exact rationals runs in ~70% of the time compared with current master.
More importantly, no heap allocation is required when working with these
immediate values.

More precisely, large finite doubles with magnitude >= 2^256
(1.157920892373162e77) must be heap allocated.  All other doubles,
including all Inf/NaNs, are immediate floats, or "iflos".

I call this approach "high-exponent boxing", because instead of using
the space of ~2^53 NaNs for non-flonum purposes, I use the space of
finite doubles with binary exponent larger than 255.  That's 3/8ths of
the total space of IEEE doubles.  This gives us a total of (3 * 2^61)
SCM values to use for other things.

To convert a double into an SCM, just add (2^60 + 2^52) to its unsigned
64-bit representation, and rotate left 4 bits.  This maps the largest
finite doubles to SCM values with 000, 110, or 111 in the low 3 bits.

* * *

I've also implemented immediate exact rationals, or "fixrats".  The
precise rule is that on a 64-bit system, an exact non-integer rational
is immediate if and only if it can be written with 54 binary digits or
less.  In terms of decimal digits, any rational that can be written with
16 decimal digits is immediate, and some that require 17 or 18 decimal
digits are immediate.

Here's the format of fixrats on 64-bit systems:

Srrrrrrxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx0111
|\____/\___________________________________________________/\__/
|  |                          |                              |
| rank (6 bits)             data (53 bits)                  tag
sign

The 53-bit data field contains both the numerator and denominator, with
the denominator in the highest bits.  The most significant bit of the
denominator is implicit, i.e. not explicitly represented.  That's why we
can represent up to 54 binary digits in a 53-bit field.  The remaining
bits of the denominator occupy the most significant data bits, and the
magnitude of the numerator occupy the least significant data bits.

The 6-bit rank is 2 less than the integer length of the denominator,
i.e. 1 less than the number of data bits allocated to the denominator.

I chose this representation because it allows us to leverage existing
floating-point hardware to efficiently pack fixrats.  Simply convert the
denominator to an IEEE double, and now the rank will be in the low bits
of the IEEE exponent field, immediately adjacent to the denominator with
its high bit removed.  This simplifies the packing operation.

There's also a nice way to extract the denominator from a fixrat: mask
out the sign bit, shift right 5 bits, and interpret it as an IEEE
double.  The denominator will be the integer part of the resulting
value, with the numerator in the fraction bits.  Simply cast this double
to an integer to discard the numerator bits.

* * *

Here are the tags used in my draft implementation:

;; /* with iflos:   xxx:  iflo (000 < xxx < 110)
;;    (64-bit)     1111:  fixnum
;;                 0111:  fixrat
;;
;;                  000:  heap object
;;             tttt0110:  immediate non-number
;;                 1110:  [NOT_SCM]
;;                11110:  [NOT_SCM] struct tag
;;          ttttt101110:  [NOT_SCM] non-pair non-struct non-smob tag
;;     ttttttttx1001110:  [NOT_SCM] smob

and here's my plan for 32-bit systems, not yet implemented:

;; without iflos:     1:  fixnum
;;    (32-bit)       10:  fixrat
;;
;;                  000:  heap object
;;             tttt0100:  immediate non-number
;;                 1100:  [NOT_SCM]
;;                11100:  [NOT_SCM] struct tag
;;          ttttt101100:  [NOT_SCM] non-pair non-struct non-smob tag
;;     ttttttttx1001100:  [NOT_SCM] smob                              */

I'll have more to say about all of this in future messages, and of
course I welcome your comments and suggestions.  For now, I've pushed my
preliminary work to the 'wip-new-tagging' branch on Savannah.

https://git.savannah.gnu.org/cgit/guile.git/log/?h=wip-new-tagging

Give it a try and let me know what you think.

Mark

```