Re: [Discuss-gnuradio] SCC v. CCC filter performance
From: Eric Blossom
Subject: Re: [Discuss-gnuradio] SCC v. CCC filter performance
Date: Thu, 16 Jun 2005 13:20:10 -0700
User-agent: Mutt/1.5.6i
On Thu, Jun 16, 2005 at 02:42:25PM -0500, David Carr wrote:
> Matt,
>
> In an earlier post you mentioned that the CCC filter implementations
> had been considerably improved. Here are the results on my machine
> (2.4GHz P4 no HT)
>
> address@hidden tests $ ./benchmark_dotprod
> generic: taps: 256 input: 4e+07 cpu: 11.630 taps/sec: 8.805e+08
> SSE: taps: 256 input: 4e+07 cpu: 4.563 taps/sec: 2.244e+09
> address@hidden tests $ ./benchmark_dotprod_ccc
> generic: taps: 256 input: 4e+07 cpu: 103.184 taps/sec: 9.924e+07
> SSE: taps: 256 input: 4e+07 cpu: 16.019 taps/sec: 6.393e+08
> address@hidden tests $ ./benchmark_dotprod_ccf
> generic: taps: 256 input: 4e+07 cpu: 100.301 taps/sec: 1.021e+08
> SSE: taps: 256 input: 4e+07 cpu: 12.830 taps/sec: 7.981e+08
> address@hidden tests $ ./benchmark_dotprod_fcc
> generic: taps: 256 input: 4e+07 cpu: 86.419 taps/sec: 1.185e+08
> SSE: taps: 256 input: 4e+07 cpu: 14.059 taps/sec: 7.284e+08
> address@hidden tests $ ./benchmark_dotprod_fsf
> generic: taps: 256 input: 4e+07 cpu: 27.738 taps/sec: 3.692e+08
> SSE: taps: 256 input: 4e+07 cpu: 21.264 taps/sec: 4.816e+08
> address@hidden tests $ ./benchmark_dotprod_scc
> generic: taps: 256 input: 4e+07 cpu: 94.179 taps/sec: 1.087e+08
> SSE: taps: 256 input: 4e+07 cpu: 31.658 taps/sec: 3.235e+08
>
> The SSRP produces short input and I usually connect it to a frequency
> xlating scf filter. While I don't have an scf benchmark, it looks like
> the ccc filter is 2x faster than the scc filter in SSE mode. If my
> math is correct, a ccc filter requires more operations per tap than a
> scc filter?!?
Yes, but... you're adding a short -> float conversion.
> Could similar improvements be made in the sc* filters or
> should I cast my incoming short data stream into a complex stream?
I think what you are seeing is the cost of converting the input data
from short to float over and over in the scc filter (each input item
gets converted once for each tap in the filter). You might want to
try running your short input into gr.short_to_float and then into a
f?? filter. Think of it as hoisting a loop invariant.
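
To make the "loop invariant" point concrete, here is a rough sketch in plain Python. It is not GNU Radio code: the function names, the sample data, and the tap values are all made up for illustration, and it assumes the per-tap conversion described above is the dominant extra cost. It computes the same dot products two ways and counts how many short -> float conversions each version performs.

```python
# Illustrative sketch only; not GNU Radio code. All names here are hypothetical.

def fir_convert_per_tap(samples, taps):
    """Dot-product filter that converts each short sample inside the tap loop."""
    conversions = 0
    out = []
    ntaps = len(taps)
    for i in range(len(samples) - ntaps + 1):
        acc = 0.0
        for j in range(ntaps):
            acc += float(samples[i + j]) * taps[j]  # conversion repeated per tap
            conversions += 1
        out.append(acc)
    return out, conversions


def fir_convert_once(samples, taps):
    """Same filter, with the short -> float conversion hoisted out of the loops."""
    as_float = [float(s) for s in samples]  # one conversion per input item
    conversions = len(as_float)
    out = []
    ntaps = len(taps)
    for i in range(len(as_float) - ntaps + 1):
        acc = 0.0
        for j in range(ntaps):
            acc += as_float[i + j] * taps[j]
        out.append(acc)
    return out, conversions


samples = list(range(-8, 8))   # 16 stand-in "short" samples
taps = [0.25, 0.5, 0.25]       # 3 stand-in filter taps

slow_out, slow_conv = fir_convert_per_tap(samples, taps)
fast_out, fast_conv = fir_convert_once(samples, taps)

print(slow_out == fast_out)    # both versions produce the same output
print(slow_conv, fast_conv)    # conversions performed by each version
```

With 3 taps the first version converts roughly 3 times as often as the second; with 256 taps, as in the benchmarks, each input item would be converted up to 256 times instead of once.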
Eric