[Discuss-gnuradio] floating point dot product benchmark results
From: Eric Blossom
Subject: [Discuss-gnuradio] floating point dot product benchmark results
Date: Thu, 18 Apr 2002 20:39:28 -0700
User-agent: Mutt/1.2.5i
Here are some results comparing floating point performance on the
Athlon and the P4 when computing dot products (FIR filters). If anyone
knows of faster code, please let me know.
Also, it takes quite a bit of hair to get the SSE code to fly: careful
scheduling, 4 sets of taps so that regardless of the input alignment
the code can always do 128-bit aligned loads, etc.
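To illustrate the "4 sets of taps" idea, here is a minimal sketch in portable C. It is not the code from the repository: the names (tap_sets, tap_sets_init, dotprod_any_alignment), the 8-tap length, and the padding size are all assumptions made for illustration, and plain scalar loops stand in for the aligned 128-bit loads that a real SSE version would use (for example via _mm_load_ps). The point is only that keeping four copies of the taps, each shifted by 0 to 3 leading zeros, lets the input pointer be rounded down to a 16-byte boundary.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NTAPS  8                    /* hypothetical; the benchmark used 256 */
#define BLOCK  4                    /* floats per 128-bit load */
#define PADDED (NTAPS + 2 * BLOCK)  /* generous room for leading/trailing zeros */

/* shifted[k] holds k zeros, then the taps, then zeros to the end. */
typedef struct {
    float shifted[BLOCK][PADDED];
} tap_sets;

static void tap_sets_init(tap_sets *ts, const float taps[NTAPS])
{
    memset(ts, 0, sizeof *ts);
    for (int k = 0; k < BLOCK; k++)
        for (int i = 0; i < NTAPS; i++)
            ts->shifted[k][k + i] = taps[i];
}

/*
 * Dot product of NTAPS floats starting at `input`, which may sit at any
 * 4-byte offset. The caller must guarantee that the whole 16-byte blocks
 * covering the window are readable (assumption of this sketch).
 */
static float dotprod_any_alignment(const tap_sets *ts, const float *input)
{
    /* How many floats past a 16-byte boundary does the input start? */
    size_t k = ((uintptr_t)input % (BLOCK * sizeof(float))) / sizeof(float);
    const float *aligned = input - k;        /* rounded-down, aligned start */
    const float *taps = ts->shifted[k];      /* tap set with k leading zeros */
    size_t nblocks = (k + NTAPS + BLOCK - 1) / BLOCK;

    float acc[BLOCK] = {0.0f, 0.0f, 0.0f, 0.0f};  /* one per SIMD lane */
    for (size_t b = 0; b < nblocks; b++)
        for (int j = 0; j < BLOCK; j++)      /* stands in for one aligned load + multiply-add */
            acc[j] += aligned[b * BLOCK + j] * taps[b * BLOCK + j];

    return (acc[0] + acc[1]) + (acc[2] + acc[3]);
}
```

The extra samples picked up before and after the window are multiplied by zero taps, so they do not change the sum as long as they are finite. The cost is four times the tap storage.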
All the code is in the CVS repository.
Look under gnuradio/src/gnu/lib/gr/*dotprod*
Eric
----------------------------------------------------------------
Single precision floating point dot product benchmark results
The test is floating point input, floating point output, floating point
taps. The tests are run with 256 taps, over 40e6 input samples.
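For readers who want a picture of what is being timed, here is a rough sketch of an unrolled C dot product over float input and float taps. It is an assumption about the general shape of such a kernel, not the benchmarked code: the function name dotprod_unrolled, the 4-way unroll factor, and the tail loop are all hypothetical.

```c
#include <stddef.h>

/* Hypothetical 4-way unrolled dot product: sum of input[i] * taps[i]. */
static float dotprod_unrolled(const float *input, const float *taps,
                              size_t n_taps)
{
    /* Four independent accumulators so the multiply-adds can overlap. */
    float acc0 = 0.0f, acc1 = 0.0f, acc2 = 0.0f, acc3 = 0.0f;
    size_t i = 0;

    for (; i + 4 <= n_taps; i += 4) {
        acc0 += input[i + 0] * taps[i + 0];
        acc1 += input[i + 1] * taps[i + 1];
        acc2 += input[i + 2] * taps[i + 2];
        acc3 += input[i + 3] * taps[i + 3];
    }
    /* Tail loop for lengths that are not a multiple of 4. */
    for (; i < n_taps; i++)
        acc0 += input[i] * taps[i];

    return (acc0 + acc1) + (acc2 + acc3);
}
```

In a benchmark like the one described, a kernel of this kind would be called once per input sample with n_taps = 256, sliding the input pointer forward by one sample each time.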
4-18-2002
Athlon MP 1800+ (1.5 GHz) running uniprocessor:

description         giga taps/sec   cycles/tap
===========         =============   ==========
unrolled C              0.847          1.77
SSE simple              1.01           1.48
SSE unrolled            1.07           1.40
3DNow! simple           1.25           1.20
3DNow! unrolled         1.4            1.07

Pentium 4 (1.7 GHz):

description         giga taps/sec   cycles/tap
===========         =============   ==========
unrolled C              0.631          2.7
SSE simple              1.28           1.32
SSE unrolled            1.7            1.0

The giga taps/sec column measures absolute performance. Big is
better.
Cycles/tap is the processor clock speed divided by taps/sec.
It is a normalized figure of merit. Small is better.
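The two columns are related by simple arithmetic, sketched below. The helper names (total_taps, taps_per_sec, cycles_per_tap) are hypothetical, and the elapsed time is a parameter because the raw timings are not given here.

```c
/* Total multiply-accumulates in a run: taps per output times outputs. */
static double total_taps(double n_taps, double n_samples)
{
    return n_taps * n_samples;      /* e.g. 256 * 40e6 = 1.024e10 */
}

/* Absolute throughput in taps per second, given a measured run time. */
static double taps_per_sec(double n_taps, double n_samples, double elapsed_sec)
{
    return total_taps(n_taps, n_samples) / elapsed_sec;
}

/* Normalized figure of merit: clock cycles spent per tap. */
static double cycles_per_tap(double clock_hz, double taps_per_second)
{
    return clock_hz / taps_per_second;
}
```

For example, cycles_per_tap(1.5e9, 0.847e9) gives roughly 1.77, which matches the Athlon "unrolled C" row.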