Re: Data types (was: Re: Access the neighbors of an element)

From: Paul Kienzle
Subject: Re: Data types (was: Re: Access the neighbors of an element)
Date: Tue, 15 Feb 2005 09:00:52 -0500

On Feb 15, 2005, at 7:52 AM, John W. Eaton wrote:

On 15-Feb-2005, Paul Kienzle <address@hidden> wrote:

| The speedup on Intel is so big
| that the slight slowdown on MIPS is a minor sacrifice
| which is outweighed by the gains in other operations.

OK.  Do you think this is likely to be true on all systems, or should
we implement a check?

Short answer is I don't know.

I can imagine a system in which there is a very fast FPU (e.g.,
on the graphics card) but the processor itself is slow.  We would
need to do a lot more work to benefit from this though.  Similarly
for machines with a vector parallel FPU (such as Intel's MMX/SSE).

The easiest thing is for us to implement what works best for most
systems (Intel and MIPS), and for the people who care about the speed
of integer operations to submit patches for the fastest, most
maintainable code they can write that supports the architectures they
want to work with.  If you want to get fancy, do some speed tests
at build time like ATLAS does and choose the code which runs the
fastest.  I don't think we need a runtime test.
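
A rough sketch of what such a build-time speed test could look like.
This is not how ATLAS does it, and the names seconds_for and faster_of
are made up for illustration; a real test would need many more
repetitions and representative data.

```cpp
#include <chrono>
#include <cstdint>

// Time `reps` calls of a candidate and return the elapsed seconds.
// The candidate takes a 64-bit value and returns a 32-bit result.
template <typename F>
double seconds_for(F candidate, int reps)
{
  volatile std::uint32_t sink = 0;  // keeps the calls from being optimized away
  const auto start = std::chrono::steady_clock::now();
  for (int i = 0; i < reps; ++i)
    sink = sink + candidate(static_cast<std::uint64_t>(i) << 20);
  const auto stop = std::chrono::steady_clock::now();
  return std::chrono::duration<double>(stop - start).count();
}

// Return 0 if the first candidate is at least as fast, otherwise 1.
template <typename F, typename G>
int faster_of(F first, G second, int reps)
{
  return seconds_for(first, reps) <= seconds_for(second, reps) ? 0 : 1;
}
```

A configure step could run a small program built on this and write
the winner into a generated header as a preprocessor define, so the
choice is fixed when the library is compiled.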

  If it might be something that would be
different for specific CPUs even if they support the same instruction
set  (i.e., not all x86 compatible hardware shows the same results)
then I think the check should be done at run time rather than at
build time.

For multiplication where we need to check the top 4 bytes of the value
on int64 operations, it is faster to shift the value in a 64-bit
register than to cast to a pointer to two 32-bit values which can be
loaded separately.  So the new 64 bit AMD and Intel extensions will
need different code than the older pure 32 bit processors, but this
test can be done at build time.
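
To make the two approaches concrete, here is a minimal sketch, not the
actual Octave code.  The function names are hypothetical, the split
version uses memcpy rather than a pointer cast to stay clear of
aliasing problems, and it assumes a plain little- or big-endian
machine.

```cpp
#include <cstdint>
#include <cstring>

// Top 4 bytes via a shift in the register.
inline std::uint32_t high32_shift(std::uint64_t v)
{
  return static_cast<std::uint32_t>(v >> 32);
}

// Top 4 bytes by viewing the value as two 32-bit words in memory.
inline std::uint32_t high32_split(std::uint64_t v)
{
  std::uint32_t words[2];
  std::memcpy(words, &v, sizeof v);

  // Which word holds the high half depends on byte order, so probe it.
  const std::uint64_t probe = 1;
  std::uint32_t probe_words[2];
  std::memcpy(probe_words, &probe, sizeof probe);
  return probe_words[0] == 1 ? words[1] : words[0];
}

// If both operands have zero upper halves, the unsigned product is
// below 2^64 and cannot overflow, so no further checking is needed.
inline bool product_fits_trivially(std::uint64_t a, std::uint64_t b)
{
  return high32_shift(a) == 0 && high32_shift(b) == 0;
}
```

Both functions return the same value; which one is faster is exactly
the architecture-dependent question, and a build-time macro could
select between them.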

| Also, the LONGLONG case should be changed regardless since
| casting to a double loses precision.

Currently we don't do 64-bit ops because of this (Matlab doesn't have
them either) but it would be nice to have them if we can.
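
The precision loss is easy to demonstrate.  The sketch below assumes
an IEEE 754 double with a 53-bit significand; survives_double is a
hypothetical helper, not an existing function.

```cpp
#include <cstdint>

// True when a 64-bit integer comes back unchanged after a round trip
// through double.
inline bool survives_double(std::int64_t v)
{
  const double d = static_cast<double>(v);
  // Values near the top of the range round up to 2^63, which cannot
  // be converted back to int64 at all.
  if (d >= 9223372036854775808.0)
    return false;
  return static_cast<std::int64_t>(d) == v;
}
```

With these assumptions, 2^53 (9007199254740992) survives the round
trip but 2^53 + 1 does not, so any 64-bit operation routed through
double can silently change values above that magnitude.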

