Re: [Discuss-gnuradio] Try to improve E100's performance at high sample

From: Almohanad Fayez
Subject: Re: [Discuss-gnuradio] Try to improve E100's performance at high sample rate
Date: Tue, 24 Jan 2012 13:33:33 -0500 (EST)

I haven't used VOLK with the OMAP processor, but in my experience with the E100 every multiplication and division in your flowgraph counts. When I was working on my C64x+ DSP-based FM receiver on the E100, I moved individual blocks one by one from the GPP to the DSP, and almost every multiplication or division on the GPP caused a buffer overflow. My impression, at least, is that if you're going for a pure GPP implementation you need to make use of NEON vector operations, and if you're using a DSP-based solution you'll need to find a way to speed up the GPP/DSP buffers, which is something I'm hoping to have more time to look into.

Almohanad Fayez
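
For readers unfamiliar with the NEON vector operations mentioned above, here is a minimal sketch of an element-wise float multiply written with NEON intrinsics. It is not VOLK's kernel and not code from this thread; the function name mult_f32 is hypothetical. On a 32-bit ARM GCC toolchain it would typically need a flag such as -mfpu=neon, and on any other target it falls back to the scalar loop.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define HAVE_NEON 1
#else
#define HAVE_NEON 0
#endif

/* Hypothetical helper: out[i] = a[i] * b[i] for i in [0, n). */
void mult_f32(float *out, const float *a, const float *b, size_t n)
{
    assert(out != NULL && a != NULL && b != NULL);
    size_t i = 0;
#if HAVE_NEON
    /* Process four floats per iteration in one 128-bit NEON register. */
    for (; i + 4 <= n; i += 4) {
        float32x4_t va = vld1q_f32(a + i);
        float32x4_t vb = vld1q_f32(b + i);
        vst1q_f32(out + i, vmulq_f32(va, vb));
    }
#endif
    /* Scalar tail (and the whole array when NEON is unavailable). */
    for (; i < n; i++)
        out[i] = a[i] * b[i];
}
```

Whether this beats the compiler's own code on the E100 is something to measure, not assume.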

-----Original Message-----
From: Evan Merewether <address@hidden>
To: discuss-gnuradio <address@hidden>
Sent: Tue, Jan 24, 2012 1:22 pm
Subject: Re: [Discuss-gnuradio] Try to improve E100's performance at high sample rate

Has anybody looked at using the CORDIC approximation for atan2?  Depending
on the required accuracy, this may dramatically improve performance in your
C code. Ultimately, you can implement the CORDIC functions in the FPGA
(quasi math-coprocessor style) which would then give you the fastest
possible computation speed.


-----Original Message-----
From: discuss-gnuradio-bounces+evan=address@hidden
[mailto:address@hidden] On Behalf Of
Sent: Tuesday, January 24, 2012 10:56 AM
To: Nick Foster
Cc: address@hidden
Subject: Re: [Discuss-gnuradio] Try to improve E100's performance at high
sample rate

On 01/19/2012 07:13 PM, Nick Foster wrote:
> Optimizing an algorithm is a hard and sometimes counterintuitive
> process. You might benchmark the following:
> - Gnuradio's atan2 WITHOUT any Volk multiplications (just comment out
> the volk mults in your block)
> - The Volk multiplications WITHOUT Gnuradio's atan2 (just comment out
> the atan2 in your block)
> This will let you determine where the bottleneck is. In addition, try
> running over a MUCH larger dataset. The clock resolution at <1ms is
> not very good and the scheduler will have a correspondingly larger
> effect at smaller timescales.
> I think you'll find the atan2 part takes vastly longer than the
> multiplications do, and that will be where you have to look for
> performance improvements.
> --n

Hi Nick,

I have been running some tests on the demodulation module. As you said,
the atan2 part takes much longer than the multiplication. So, to
maximize the performance improvement that Volk could bring to the
processing, I took a division and a multiplication out of atan2 and used
a Volk-ified divider and multiplier instead. Then I ran the tests on a
much larger dataset.

But the test results did not show a performance improvement. In fact,
the average processing time even increased a little. So I am wondering
whether what I did is simply not a good way to improve the performance.

Another issue is that when I ran CMake to build GNU Radio on the E100,
it reported this:
-- Available arches: generic;neon
-- Available machines: generic;neon
-- Did not find liborc and orcc, disabling orc support...

But according to the "opkg list-installed | grep orc" check, both orc and
liborc are installed. Could this lack of orc support be part of the reason
why my implementation did not show a performance improvement?

I would appreciate it if you could give me a hand with this. Thanks.

Best Regards,


Discuss-gnuradio mailing list