[Discuss-gnuradio] floating point dot product benchmark results


From: Eric Blossom
Subject: [Discuss-gnuradio] floating point dot product benchmark results
Date: Thu, 18 Apr 2002 20:39:28 -0700
User-agent: Mutt/1.2.5i

Here are some results re floating point performance computing dot
products (FIR filters) between the Athlon and the P4.  If anyone knows
of faster code please let me know.

Also, it takes quite a bit of hair to get the SSE code to fly: careful
scheduling, 4 sets of taps so that regardless of the input alignment
the code can always do 128-bit aligned loads, etc.
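
The "4 sets of taps" idea can be sketched in portable C. This is one
plausible reading of the trick, not the code from the repository: the
names NTAPS, PADDED, shifted_taps, shifted_taps_init and dotprod_aligned
are all hypothetical, and plain C loops stand in for the SSE
instructions. Copy k of the taps is shifted right by k floats and zero
padded, so the input pointer can be rounded down to a 16-byte boundary
and the extra samples picked up at either end are multiplied by zero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NTAPS  8                          /* hypothetical filter length */
#define PADDED ((NTAPS + 3 + 3) / 4 * 4)  /* room for a shift of up to 3, rounded up to a multiple of 4 */

/* Four copies of the taps; copy k is shifted right by k floats and zero padded. */
typedef struct {
    float set[4][PADDED];
} shifted_taps;

void shifted_taps_init(shifted_taps *st, const float *taps)
{
    memset(st, 0, sizeof *st);
    for (int k = 0; k < 4; k++)
        memcpy(&st->set[k][k], taps, NTAPS * sizeof(float));
}

/*
 * Dot product of taps[0..NTAPS-1] with input[0..NTAPS-1].
 * Assumption: up to 3 floats before input and up to 4 floats after
 * input[NTAPS-1] are readable and finite, since they are loaded and
 * multiplied by the zero padding.
 */
float dotprod_aligned(const shifted_taps *st, const float *input)
{
    /* k = number of floats the input sits past the previous 16-byte boundary */
    size_t k = (size_t)((uintptr_t)input & 15u) / sizeof(float);
    assert(k < 4);

    const float *in = input - k;   /* 16-byte aligned start */
    const float *t  = st->set[k];  /* tap copy with the matching shift */

    /* Four accumulators mimic the four lanes of a 128-bit register. */
    float acc[4] = { 0.0f, 0.0f, 0.0f, 0.0f };
    for (size_t i = 0; i < PADDED; i += 4) {
        acc[0] += in[i + 0] * t[i + 0];
        acc[1] += in[i + 1] * t[i + 1];
        acc[2] += in[i + 2] * t[i + 2];
        acc[3] += in[i + 3] * t[i + 3];
    }
    return (acc[0] + acc[1]) + (acc[2] + acc[3]);
}
```

In a real SSE version each group of four multiply-adds would become one
aligned 128-bit load and one packed multiply-add pair, and the tap
copies themselves would also need to be 16-byte aligned. The cost of
the trick is four times the tap storage plus a few wasted multiplies by
zero per dot product.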

All the code is in the CVS repository.  
Look under gnuradio/src/gnu/lib/gr/*dotprod*

Eric

----------------------------------------------------------------

Single precision floating point dot product benchmark results
The test is floating point input, floating point output, floating point
taps.  The tests are run with 256 taps, over 40e6 input samples.
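
For reference, a minimal sketch of what an "unrolled C" dot product and
the surrounding FIR loop might look like. This is an illustration
written for this note, not the benchmarked code: the function names
dotprod_unrolled and fir_filter are hypothetical, the unroll factor of
four is an assumption, and the taps are assumed to be stored already
reversed so that the filter is a plain sliding dot product.

```c
#include <assert.h>
#include <stddef.h>

/* Dot product with the loop unrolled by four; n_taps must be a multiple of 4. */
float dotprod_unrolled(const float *input, const float *taps, size_t n_taps)
{
    assert(n_taps % 4 == 0);

    /* Independent accumulators let the adds overlap in the pipeline. */
    float acc0 = 0.0f, acc1 = 0.0f, acc2 = 0.0f, acc3 = 0.0f;
    for (size_t i = 0; i < n_taps; i += 4) {
        acc0 += input[i + 0] * taps[i + 0];
        acc1 += input[i + 1] * taps[i + 1];
        acc2 += input[i + 2] * taps[i + 2];
        acc3 += input[i + 3] * taps[i + 3];
    }
    return (acc0 + acc1) + (acc2 + acc3);
}

/*
 * FIR filter: one dot product per output sample.
 * input must hold at least n_out + n_taps - 1 samples.
 */
void fir_filter(const float *input, float *output, size_t n_out,
                const float *taps, size_t n_taps)
{
    for (size_t i = 0; i < n_out; i++)
        output[i] = dotprod_unrolled(input + i, taps, n_taps);
}
```

With 256 taps, each output sample costs 256 multiply-adds, which is why
the results are reported per tap rather than per sample.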


4-18-2002

Athlon MP 1800+ (1.5 GHz) running uniprocessor:

    description         giga taps/sec   cycles/tap
    ===========         ============    ============
    unrolled C          0.847           1.77
    SSE simple          1.01            1.48
    SSE unrolled        1.07            1.40
    3DNow! simple       1.25            1.20
    3DNow! unrolled     1.4             1.07


Pentium 4 (1.7 GHz):

    description         giga taps/sec   cycles/tap
    ===========         ============    ============
    unrolled C          0.631           2.7
    SSE simple          1.28            1.32
    SSE unrolled        1.7             1.0


The giga taps/sec column measures absolute performance.  Big is
better.

Cycles/tap is the processor clock speed divided by taps/sec.  
It is a normalized figure of merit.  Small is better.
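
The two columns are related by simple arithmetic, sketched below. The
helper names are hypothetical, and the giga taps/sec formula (taps per
output times samples, divided by elapsed time) is an assumption about
how the figure was derived from a timed run; only the cycles/tap
definition comes from the text above.

```c
#include <assert.h>

/* Assumed derivation: total multiply-adds divided by wall-clock time, in units of 1e9. */
double giga_taps_per_sec(double n_taps, double n_samples, double elapsed_sec)
{
    assert(elapsed_sec > 0.0);
    return n_taps * n_samples / elapsed_sec / 1e9;
}

/* Processor clock speed divided by taps/sec, as defined above. */
double cycles_per_tap(double clock_hz, double taps_per_sec)
{
    assert(taps_per_sec > 0.0);
    return clock_hz / taps_per_sec;
}
```

For example, cycles_per_tap(1.5e9, 0.847e9) comes out at about 1.77,
matching the Athlon "unrolled C" row.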


