octave-maintainers

## Re: mixed type operations in Octave

From: John W. Eaton
Subject: Re: mixed type operations in Octave
Date: Tue, 09 Sep 2008 16:38:14 -0400

```On  9-Sep-2008, Jaroslav Hajek wrote:

| Addition and subtraction can be done using just the native operations.
| For instance, addition can be done like this:
|
| T add (T x, T y)
| {
|   T u;
|   if (x >= 0)
|     {
|       u = std::numeric_limits<T>::max () - x;
|       if (y > u)
|         ftruncate = true;
|       else
|         u = y;
|     }
|   else
|     {
|       u = std::numeric_limits<T>::min () - x;
|       if (y < u)
|         ftruncate = true;
|       else
|         u = y;
|     }
|   return x + u;
| }

Where is ftruncate declared and how is it used?

Maybe you would also want to check x == 0 and y == 0?  Something like

template <typename T>
T
add (const T& x, const T& y)
{
  if (x == 0)
    return y;
  else if (y == 0)
    return x;
  else if (x > 0)
    {
      static T mx = std::numeric_limits<T>::max ();
      return (y > mx - x) ? mx : x + y;
    }
  else
    {
      static T mn = std::numeric_limits<T>::min ();
      return (y < mn - x) ? mn : x + y;
    }
}

?

| or like this (uses the bit properties of two's complement signed int
| representation, is significantly faster)
|
| T add (T x, T y)
| {
|       T u = x + y;
|       T ux = u ^ x, uy = u ^ y;
|       if ((ux & uy) < 0)
|         {
|           u = std::numeric_limits<T>::max () + signbit (~u);
|           ftruncate = true;
|         }
|       return u;
| }

I'd think that which version is faster would depend on the compiler
and hardware, at least to some degree.  And I'd really rather not use
bit twiddling tricks that are not guaranteed to work for all
arithmetic models unless they are accompanied by some configure checks
so that this code won't cause trouble in the future if/when someone
tries to port Octave to some exotic system.

| Both versions seem to be faster than going via double, the current
| implementation.
| While version 1 is million percent portable, version 2 may not be if
| the target machine does not use two's complement signed integers (i.e.
| the signed arithmetic is identical to unsigned). I don't know if there
| are any such architectures we actually care about that do not support
| this. In fact, even the autoconf manual says that assuming two's
| complement is, for practical purposes, safe.
| OTOH, on most, if not all, architectures, version 2 will work and is
| still considerably faster than version 1.
|
| Multiplication is best done by promoting to a wider integer type,
| multiplying, and fit-to-range. Again, avoiding the int-real
| conversions is a performance win. If 128-bit int is not available,
| 64-bit multiplication can be done by using bit shifts.

Before making claims about what is faster, I guess I'd like to see
some actual numbers for the different versions on several different
architectures.

BTW, how much faster are we talking about here?

Although speed is nice, I think it would be better to spend time
working on missing features first.

jwe

```
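The three techniques discussed in the message above (comparison-only saturating addition, the sign-bit overflow test, and multiplication by promotion to a wider type) can be sketched side by side as follows. This is a minimal illustration, not Octave's actual implementation: the names `sat_add_portable`, `sat_add_bits`, `sat_mul32`, and `check_add_variants_agree` are hypothetical, the `ftruncate` flag from the quoted code is left out, and the sign-bit variant is rewritten with unsigned arithmetic so that the wrapping sum itself is well defined. It still assumes a two's complement target for the final conversion back to the signed type, which is the portability concern raised in the thread.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <type_traits>

// Comparison-only saturating add: every intermediate value stays in range,
// so no signed overflow can occur on any arithmetic model.
template <typename T>
T sat_add_portable (T x, T y)
{
  const T mx = std::numeric_limits<T>::max ();
  const T mn = std::numeric_limits<T>::min ();
  if (x >= 0)
    return (y > mx - x) ? mx : static_cast<T> (x + y);
  else
    return (y < mn - x) ? mn : static_cast<T> (x + y);
}

// Sign-bit variant. The sum is formed in the unsigned type, where
// wraparound is well defined. ASSUMPTION: converting the unsigned result
// back to T yields the two's complement value (true on mainstream
// compilers, guaranteed only from C++20 on).
template <typename T>
T sat_add_bits (T x, T y)
{
  typedef typename std::make_unsigned<T>::type U;
  const U ux = static_cast<U> (x);
  const U uy = static_cast<U> (y);
  const U us = static_cast<U> (ux + uy);
  const U sign
    = static_cast<U> (U (1) << (std::numeric_limits<U>::digits - 1));
  // Overflow happened iff the sum's sign differs from both operands' signs.
  if (((us ^ ux) & (us ^ uy)) & sign)
    return (x < 0) ? std::numeric_limits<T>::min ()
                   : std::numeric_limits<T>::max ();
  return static_cast<T> (us);
}

// Saturating 32-bit multiply: promote to 64 bits, multiply, fit to range.
// The 64-bit product of two 32-bit values cannot overflow.
inline std::int32_t sat_mul32 (std::int32_t x, std::int32_t y)
{
  const std::int64_t p = static_cast<std::int64_t> (x) * y;
  if (p > std::numeric_limits<std::int32_t>::max ())
    return std::numeric_limits<std::int32_t>::max ();
  if (p < std::numeric_limits<std::int32_t>::min ())
    return std::numeric_limits<std::int32_t>::min ();
  return static_cast<std::int32_t> (p);
}

// Exhaustively compare the two addition variants over all int8_t pairs.
inline void check_add_variants_agree ()
{
  for (int a = -128; a <= 127; a++)
    for (int b = -128; b <= 127; b++)
      {
        const std::int8_t x = static_cast<std::int8_t> (a);
        const std::int8_t y = static_cast<std::int8_t> (b);
        assert (sat_add_portable<std::int8_t> (x, y)
                == sat_add_bits<std::int8_t> (x, y));
      }
}
```

Calling `check_add_variants_agree ()` on a given platform is a cheap way to confirm that the sign-bit variant matches the portable one there. It says nothing about relative speed, which would still need to be measured per compiler and architecture, as the message asks.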