Re: [Discuss-gnuradio] SOCIS project update 9


From: Martin Braun
Subject: Re: [Discuss-gnuradio] SOCIS project update 9
Date: Fri, 31 Jul 2015 09:50:03 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.8.0

Johannes,

you forgot to mention you will be presenting your stuff at GRCon in
Washington DC in a few weeks :)

Cheers,
Martin

On 31.07.2015 02:50, Johannes Demel wrote:
> Hey community!
> 
> Here we go again. Another project update.
> I've been working with VOLK and SIMD for two weeks now. I fixed some
> hiccups with last week's pack and unpack kernels. They now run just fine
> in the tests.
> Also, I added a 'volk_8u_x3_encodepolar_8u_x2' kernel. It operates on
> the assumption that each byte carries one active bit, located in the
> LSB. A quick performance test with a head block of 2^32 samples after
> the encoder shows that the generic version crunches ~160MSps. So far I
> had an encoder which operated on packed bytes and did ~300MSps. An
> unpack block was added to the flowgraph with the 'extended_encoder' in
> use. The vector-optimized version does ~570MSps, so it is ~3.5x as fast
> as the generic version. Some more optimization might yield even better
> results.
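
To illustrate the "one active bit per byte, in the LSB" layout, here is a
minimal sketch of the plain polar transform on unpacked bytes. It is not the
VOLK kernel: the function name 'polar_encode_unpacked' is hypothetical, the
frame length is assumed to be a power of two, and frozen-bit insertion and
any bit-reversal permutation are left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of VOLK.
 * frame: n bytes, each holding one bit in its LSB (value 0 or 1).
 * n:     frame length, assumed to be a power of two.
 * Applies the butterfly network x = u * F^(kron m) in place,
 * with F = [[1,0],[1,1]] over GF(2). */
void polar_encode_unpacked(uint8_t* frame, size_t n)
{
    for (size_t half = 1; half < n; half *= 2) {          /* one pass per stage */
        for (size_t base = 0; base < n; base += 2 * half) {
            for (size_t i = 0; i < half; ++i) {
                /* upper branch becomes the XOR of both branches,
                 * lower branch passes through unchanged */
                frame[base + i] ^= frame[base + half + i];
            }
        }
    }
}
```

With n = 4, the input {0,0,0,1} becomes {1,1,1,1}. Applying the transform
twice returns the original frame, which makes for a cheap sanity check.
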
> At first glance it is weird that the output signature of the encoder is
> '8u_x2'. The kernel internally needs a temporary buffer which has the
> same size as the output buffer. Instead of malloc'ing and free'ing it on
> every call, it can be created once and reused for every call.
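
The caller-owned temporary buffer could look roughly like the sketch below.
This is only an illustration under assumptions: 'encode_with_scratch' and
'butterfly_stage' are hypothetical names, the real kernel signature differs,
and the frame length is assumed to be a power of two. Each stage reads from
one buffer and writes to the other, so nothing is allocated inside the call.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One out-of-place butterfly stage: read src, write dst (must not alias). */
static void butterfly_stage(uint8_t* dst, const uint8_t* src, size_t n, size_t half)
{
    for (size_t base = 0; base < n; base += 2 * half) {
        for (size_t i = 0; i < half; ++i) {
            dst[base + i] = src[base + i] ^ src[base + half + i];
            dst[base + half + i] = src[base + half + i];
        }
    }
}

/* Hypothetical API sketch, not the VOLK kernel.
 * frame: input bits on entry (one bit per byte, LSB), encoded bits on return.
 * temp:  scratch buffer of n bytes owned by the caller; allocate it once
 *        and pass it to every call. */
void encode_with_scratch(uint8_t* frame, uint8_t* temp, size_t n)
{
    uint8_t* src = frame;
    uint8_t* dst = temp;
    for (size_t half = 1; half < n; half *= 2) {
        butterfly_stage(dst, src, n, half);
        uint8_t* swap = src; /* ping-pong between the two buffers */
        src = dst;
        dst = swap;
    }
    if (src != frame) {
        memcpy(frame, src, n); /* result ended up in temp: copy it back */
    }
}
```

A caller would allocate 'temp' once next to the output buffer and hand the
same pointer to every call, which is the reuse described above.
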
> During the week I was struggling with the VOLK tests. I finally solved
> those issues, but I'd like to refer to the mail I sent out the other day.
> SIMD code tends to have quite a few lines of code. In order to make it
> easier to read and understand, it would be great if it was possible to
> implement multiple functions within one '#ifdef LV_HAVE_ARCH ... #endif'
> block. But so far the compiler refuses to compile when I do this. It
> is possible to add functions in the general section, but that's only
> appropriate for a generic kernel or common functions.
> All the intrinsics I used so far are available on SSSE3. Although I
> created aligned and unaligned versions of those kernels, only store[u]
> and load[u] might make a difference here. My benchmarks don't show any
> significant difference. All benchmarks are done on a Sandy Bridge i7.
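
For readers unfamiliar with the load[u]/store[u] distinction, here is a small
sketch using SSE2 intrinsics on a bytewise XOR. The functions 'xor_bytes_u'
and 'xor_bytes_a' are hypothetical and are not taken from the kernels above.
The two variants differ only in the load and store intrinsics; the aligned one
requires all pointers to be 16-byte aligned. On targets without SSE2 the
sketch falls back to the scalar loop.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h> /* _mm_load[u]_si128, _mm_store[u]_si128, _mm_xor_si128 */
#endif

/* Unaligned variant: pointers may have any alignment. */
void xor_bytes_u(uint8_t* dst, const uint8_t* a, const uint8_t* b, size_t n)
{
    size_t i = 0;
#if defined(__SSE2__)
    for (; i + 16 <= n; i += 16) {
        __m128i va = _mm_loadu_si128((const __m128i*)(a + i));
        __m128i vb = _mm_loadu_si128((const __m128i*)(b + i));
        _mm_storeu_si128((__m128i*)(dst + i), _mm_xor_si128(va, vb));
    }
#endif
    for (; i < n; ++i) { /* scalar tail, or the whole loop without SSE2 */
        dst[i] = a[i] ^ b[i];
    }
}

/* Aligned variant: dst, a and b must all be 16-byte aligned. */
void xor_bytes_a(uint8_t* dst, const uint8_t* a, const uint8_t* b, size_t n)
{
    size_t i = 0;
#if defined(__SSE2__)
    for (; i + 16 <= n; i += 16) {
        __m128i va = _mm_load_si128((const __m128i*)(a + i));
        __m128i vb = _mm_load_si128((const __m128i*)(b + i));
        _mm_store_si128((__m128i*)(dst + i), _mm_xor_si128(va, vb));
    }
#endif
    for (; i < n; ++i) {
        dst[i] = a[i] ^ b[i];
    }
}
```

Both variants compute the same result; the only question is whether the
aligned loads and stores are measurably faster on a given CPU.
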
> 
> I suspect the encoder was easier to optimize than the decoder will be.
> So for the upcoming week and beyond I will focus on creating kernels for
> polar decoding.
> 
> More info and current project progress can be found in [1], [2] and [3].
> 
> Cheers
> Johannes
> 
> [1] https://github.com/jdemel/gnuradio
> [2] https://github.com/jdemel/socis-proposal
> [3] https://github.com/jdemel/volk
> 