discuss-gnuradio

Re: [Discuss-gnuradio] VOLK: fast way to log10()?


From: West, Nathan
Subject: Re: [Discuss-gnuradio] VOLK: fast way to log10()?
Date: Thu, 17 Sep 2015 01:56:16 -0400

On Wed, Sep 16, 2015 at 8:20 PM, Martin Braun <address@hidden> wrote:
On 16.09.2015 13:29, West, Nathan wrote:
> There is a volk_32f_s32f_multiply_32f. It doesn't operate in-place, but
> almost none of the VOLK kernels do. I think it's safe to give the same
> output buffer as input buffer. (I've heard that doing stuff in-place is
> noticeably better, but I've never tested this and I'm a tad skeptical.
> I'll buy someone a beer whenever I see them if they prove me wrong with
> >= 5 kernels)

I've nagged people about this before, but I'd like to make this an
official thing: Put this into the VOLK docs (i.e. state in the contract
that in- and output buffers may be the same) and then include that in
the unit tests, so we don't end up with some arcane ISA that will not
allow this. In-place VOLK calls are very useful for many blocks, and
I've shied away from using them myself in the past just because I wanted
to be sure they'll work in the future.

M

I think it's reasonable to make a contract that in and out buffers may be the same and to document that.

Putting it in the QA is a tad more difficult, since some kernels have multiple input buffers and others have multiple output buffers. Which ones do we test? For single-input, single-output kernels that's reasonable, but beyond that it becomes a large number of permutations. There is a GCC keyword __restrict (or __restrict__) that supposedly lets GCC do some optimizations if pointers are not aliased. Since we never use this keyword, it's OK to use the same buffer unless there's some funky kernel (volk_16i_branch_4_state_8 comes to mind), in which case you are probably aware of what you're doing.
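For the single-input, single-output case, a QA check could look roughly like the sketch below. This is not VOLK's actual QA code: scale_generic is a hypothetical stand-in kernel (its argument order mirrors volk_32f_s32f_multiply_32f), and aliased_matches_separate and NUM_POINTS are names made up for illustration. The idea is to run the kernel once with separate buffers and once with out == in, then compare the results.

```c
#include <string.h>

#define NUM_POINTS 8 /* arbitrary test length, chosen for illustration */

/* Hypothetical stand-in for a single-input, single-output kernel.
 * The pointers are not restrict-qualified, so out and in may alias. */
static void scale_generic(float* out, const float* in, float scalar,
                          unsigned int num_points)
{
    for (unsigned int i = 0; i < num_points; i++) {
        out[i] = in[i] * scalar;
    }
}

/* Returns 1 if calling the kernel with out == in produces the same
 * bytes as calling it with separate buffers, 0 otherwise. */
static int aliased_matches_separate(float scalar)
{
    float in[NUM_POINTS];
    float expected[NUM_POINTS];
    float aliased[NUM_POINTS];

    for (unsigned int i = 0; i < NUM_POINTS; i++) {
        in[i] = (float)i - 3.5f;
    }

    /* Reference result: distinct input and output buffers. */
    scale_generic(expected, in, scalar, NUM_POINTS);

    /* Same kernel, but the output buffer is also the input buffer. */
    memcpy(aliased, in, sizeof(in));
    scale_generic(aliased, aliased, scalar, NUM_POINTS);

    return memcmp(expected, aliased, sizeof(expected)) == 0;
}
```

A real test would presumably loop over every implementation of a kernel rather than a single generic one, and the multi-buffer kernels would still need a decision about which pointer pairs to alias.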

Finally, I want to clarify what I meant by in-place operations. I meant the kind of signature Dennis described: volk_32f_s32f_multiply(float* buffer, float scalar, num_points), where the input and output buffers are explicitly the same and the result is always written to the memory location the input came from. I don't consider using kernels such as the existing log2 with input buffer == output buffer to be in-place, since the compiler is not aware that the two pointers refer to the same memory location.
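To make the distinction concrete, here is a minimal sketch of the two shapes. Both function names are hypothetical and neither is an existing VOLK kernel. The first takes a single buffer, so the in-place behaviour is part of the signature. The second takes separate pointers, and the caller merely happens to pass the same one twice.

```c
/* Explicit in-place: one pointer, so it is visible inside the function
 * that every element is read from and written back to the same location. */
static void multiply_inplace(float* buffer, float scalar,
                             unsigned int num_points)
{
    for (unsigned int i = 0; i < num_points; i++) {
        buffer[i] *= scalar;
    }
}

/* Out-of-place signature: two pointers. When compiling this function the
 * compiler cannot know whether out and in refer to the same memory. */
static void multiply_out_of_place(float* out, const float* in, float scalar,
                                  unsigned int num_points)
{
    for (unsigned int i = 0; i < num_points; i++) {
        out[i] = in[i] * scalar;
    }
}
```

Calling multiply_out_of_place(buf, buf, 2.0f, n) gives the same values as multiply_inplace(buf, 2.0f, n) here, but only the second form tells the compiler about the aliasing up front.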

nw
