Re: [Discuss-gnuradio] VOLK division between complexes

From: Federico Larroca
Subject: Re: [Discuss-gnuradio] VOLK division between complexes
Date: Fri, 13 May 2016 15:59:00 -0300

Thank you, Andy. However, I only need the division, although the transform would indeed be a good idea if more operations were needed.

So far, I've applied the following lines with some significant savings (w.r.t. a loop):

volk_32fc_x2_multiply_conjugate_32fc(&c[0], &a[0], &b[0], N); // c = a*conj(b)
volk_32fc_magnitude_squared_32f(&mag_sq_b[0], &b[0], N); // mag_sq_b = |b|^2
volk_32f_x2_divide_32f(&inv_mag_sq_b[0], &ones[0], &mag_sq_b[0], N); // inv_mag_sq_b = 1/|b|^2 (ones is a previously defined array of N ones)
volk_32fc_32f_multiply_32fc(&out[0], &c[0], &inv_mag_sq_b[0], N); // out = c*inv_mag_sq_b = a*conj(b)/|b|^2 = a/b
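
For anyone who wants to check the arithmetic without VOLK, here is a minimal plain-C++ sketch of the same four steps, written as scalar loops. The function name divide_via_conjugate is made up for illustration and is not part of VOLK; the sketch assumes both arrays have the same length and that no element of b is zero.

```cpp
#include <complex>
#include <cstddef>
#include <vector>

using cfloat = std::complex<float>;

// Hypothetical helper (not a VOLK function): out[i] = a[i] / b[i],
// computed as a[i] * conj(b[i]) / |b[i]|^2.
// Assumes a.size() == b.size() and that no b[i] is zero.
std::vector<cfloat> divide_via_conjugate(const std::vector<cfloat>& a,
                                         const std::vector<cfloat>& b)
{
    const std::size_t n = a.size();
    std::vector<cfloat> c(n), out(n);
    std::vector<float> mag_sq_b(n), inv_mag_sq_b(n);

    // Step 1: c = a * conj(b)
    for (std::size_t i = 0; i < n; ++i)
        c[i] = a[i] * std::conj(b[i]);

    // Step 2: mag_sq_b = |b|^2 (std::norm returns the squared magnitude)
    for (std::size_t i = 0; i < n; ++i)
        mag_sq_b[i] = std::norm(b[i]);

    // Step 3: inv_mag_sq_b = 1 / |b|^2
    for (std::size_t i = 0; i < n; ++i)
        inv_mag_sq_b[i] = 1.0f / mag_sq_b[i];

    // Step 4: out = c * inv_mag_sq_b = a / b
    for (std::size_t i = 0; i < n; ++i)
        out[i] = c[i] * inv_mag_sq_b[i];

    return out;
}
```

Comparing its output against a direct a[i] / b[i] on a few samples is a cheap sanity check before swapping in the vectorized calls.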

The alternative of using VOLK's pow operator turned out to be significantly slower.

I've also seen interesting performance improvements by simplifying some for loops that are not amenable to VOLK, as suggested by Marcus. On the other hand, I'm crazy enough to try to implement a VOLK kernel that performs the division. I've just started and don't know if I'll be successful, but I guess I'll learn something anyhow.


2016-05-13 15:14 GMT-03:00 Andy Walls <address@hidden>:
On Thu, 2016-05-12 at 16:24 -0400, address@hidden
> Date: Wed, 11 May 2016 16:09:56 -0300
> From: Federico Larroca
> To: address@hidden
> Subject: [Discuss-gnuradio] VOLK division between complexes

> Hello everyone,
> We are at the stage of optimizing our project (gr-isdbt). One of the
> most time-consuming blocks is OFDM synchronization, and in particular the
> equalization phase. This is simply the division between the input
> signal and the estimated channel gains (two modestly big arrays of
> ~5000 complexes for each OFDM symbol).
> Until now, this was performed by a for loop, so my plan was to change
> it for a volk function. However, there is no complex division in VOLK.
> So I've done a rather indirect operation using the property that a/b =
> a*conj(b)/|b|^2, resulting in six lines of code (a multiply conjugate,
> a magnitude squared, a deinterleave, a couple of float divisions and
> an interleave). Obviously the performance gain (measured with the
> Performance Monitor) is marginal (to be optimistic)...
> Does anyone have a better idea?

I have a different idea, but I doubt it is better.  The transformation

w = Log (z) = ln|z| + jArg(z)

transforms multiplication, division, power and root operations into
addition, subtraction, multiplication and division operations, respectively.

So if c = Log(a), d = Log(b), then a/b = Exp(c-d) .

If along with your complex division, you also have a lot of additional
complex multiplication, power, and/or (real) root operations to perform,
then the transform *might* give you a savings.  A savings would also be
more likely, if you don't need to invert the transformation at the end
(i.e. no need for z = Exp(w)).
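
As a quick illustration of the identity only, here is a minimal C++ sketch using std::complex. The function name divide_via_log is made up for this example, and it assumes both operands are nonzero, since Log(0) is undefined. It says nothing about speed: a per-sample log and exp will usually cost more than a direct division.

```cpp
#include <complex>

using cfloat = std::complex<float>;

// Hypothetical helper: a / b computed as Exp(Log(a) - Log(b)).
// Assumes a != 0 and b != 0.
cfloat divide_via_log(cfloat a, cfloat b)
{
    const cfloat c = std::log(a);  // c = ln|a| + j*Arg(a)
    const cfloat d = std::log(b);  // d = ln|b| + j*Arg(b)
    // The phase difference may fall outside (-pi, pi], but Exp is
    // 2*pi-periodic in the imaginary part, so the result is unaffected.
    return std::exp(c - d);
}
```

If several multiplications and divisions are chained, the intermediate values can stay in the log domain and only the final result needs the Exp.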


>  Implementing a new kernel is simply beyond my knowledge.
> Best
> Federico
