discuss-gnuradio


From: Dennis Glatting
Subject: Re: [Discuss-gnuradio] First integration (was: Re: Run graph/ scheduler overhead)
Date: Sat, 25 Jul 2015 00:57:34 -0700

On Fri, 2015-07-24 at 23:30 -0700, Richard Bell wrote:
> This topic of hardware efficient designs vs general purpose processor
> (GPP) efficient is very interesting. I assumed the most efficient
> hardware designs would tend to translate well into a GPP, because most
> good hardware designs are good because they parallelize things like
> crazy. You would think that would pay off in a GPP as well, up to the
> number of cores available. After reading this and Tom's blog on FFT
> filters vs FIR filters vs Polyphase filters, it seems not to be the
> case in a lot of situations. My basic intuition makes me think now that
> the GPP is very similar to a serial process when compared to an FPGA.
> For this reason it is worth focusing a lot of energy on efficient
> serial designs and not as much on massively parallel ones. 
> 
> Is there literature on what metrics matter most when designing
> algorithms for a GPP vs an FPGA, or is it mostly something you learn
> through experience? The GPP has so many variables that come along with
> it that I imagine it's hard to do much other than implement and test.
> Fun to think about. 
> 


From a pseudo-GPP perspective there is a lot of interesting work but
portability remains an issue. For example, I am using OpenMP in some of
my blocks for signal detection/qualification and error
correction/validation. OpenMP works great under gcc but is near
non-existent under clang, which is the default compiler on FreeBSD. I've
had limited success under ARM/embedded, too.
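
For illustration only, here is a minimal sketch of the kind of loop-level
OpenMP parallelism described above. It is not code from the blocks
mentioned; the function name block_energy and the energy-summing task are
assumptions chosen to keep the example small. If the compiler has no
OpenMP support, or the OpenMP flag is not given, the pragma is ignored
and the loop simply runs serially, which is one way to live with the
portability problem.

```cpp
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from GNU Radio): sum of |x|^2 over a buffer,
// a common first step in energy-based signal detection.
double block_energy(const std::vector<std::complex<float> >& samples)
{
    double energy = 0.0;
    const long n = static_cast<long>(samples.size());

    // Each thread accumulates a private partial sum; OpenMP combines the
    // partial sums at the end of the loop. Without OpenMP enabled, this
    // pragma is ignored and the loop runs on one core.
#pragma omp parallel for reduction(+ : energy)
    for (long i = 0; i < n; ++i) {
        energy += std::norm(samples[i]); // |x|^2 = re^2 + im^2
    }
    return energy;
}
```

With gcc this would typically be built with -fopenmp. The signed loop
index is deliberate, since older OpenMP versions require one.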

OpenCL is interesting, but I've done little work with it. There are many
trade-offs and I'd like to see how OpenCL can be practically applied to
GNURadio. On ARM/embedded systems, where cores and clock speed are
limited, it would be interesting to offload some of the work to the
graphics chip (my MinnowBoard has only two cores).

I built a hash cracker (three HD290Xs), and one of the lessons was that
graphics chips typically aren't used under heavy load for long
durations, so cooling (mine is now liquid cooled) and power consumption
require additional attention. My CubieBoard CC-A80 had the same problem
running GNURadio where I had to mount a tiny fan onto the CPU's heat
sink to keep it from going into thermal shutdown.

The DC Block is pretty much serial -- it's mostly a chain of moving
averages. All I could really do is improve the efficiency of the code.
There are probably better algorithms.
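
To make the "chain of moving averages" concrete, here is a simplified
sketch, assuming only two cascaded averagers of length D and real-valued
samples. It is not the GNU Radio implementation; the class names
moving_average and dc_block_sketch are made up for the example. The
cascade estimates the DC level, and that estimate is subtracted from the
input delayed by D-1 samples, which is the group delay of the two
averagers combined. Each output depends on the running sums left by the
previous sample, which is why the structure is inherently serial.

```cpp
#include <cstddef>
#include <vector>

// Length-D moving average kept as a running sum, so each sample costs
// O(1) work regardless of D. D must be >= 1.
class moving_average {
public:
    explicit moving_average(std::size_t D)
        : d_hist(D, 0.0), d_pos(0), d_sum(0.0) {}

    double filter(double x) {
        d_sum += x - d_hist[d_pos];          // add newest, drop oldest
        d_hist[d_pos] = x;
        d_pos = (d_pos + 1) % d_hist.size();
        return d_sum / static_cast<double>(d_hist.size());
    }

private:
    std::vector<double> d_hist;  // last D inputs (circular buffer)
    std::size_t d_pos;           // index of the oldest input
    double d_sum;                // running sum of d_hist
};

// Hypothetical two-stage DC blocker: output = delayed input - DC estimate.
class dc_block_sketch {
public:
    explicit dc_block_sketch(std::size_t D)
        : d_ma0(D), d_ma1(D), d_delay(D, 0.0), d_pos(0) {}

    double filter(double x) {
        // Two cascaded averages estimate the DC level.
        const double dc = d_ma1.filter(d_ma0.filter(x));

        // Delay the input by D-1 samples to line it up with the estimate.
        d_delay[d_pos] = x;
        d_pos = (d_pos + 1) % d_delay.size();
        const double delayed = d_delay[d_pos]; // written D-1 calls ago

        return delayed - dc;
    }

private:
    moving_average d_ma0;
    moving_average d_ma1;
    std::vector<double> d_delay; // circular delay line of length D
    std::size_t d_pos;
};
```

The running sum is kept in double here to limit rounding drift; a float
version would need more care over long runs.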



> Rich
> 
> Sent from my iPad
> 
> > On Jul 24, 2015, at 10:51 PM, Dennis Glatting <address@hidden> wrote:
> > 
> > 
> > 
> >> If you can put together a patch that gives us a bit of a boost here,
> >> that'd be great. But as you say, it doesn't look like this algorithm
> >> as it is will ever be fantastically fast. It was definitely meant more
> >> for hardware than this case.
> > 
> > My first attempt at integration sees a performance improvement from
> > 1.6ms to <510us (roughly -68%) according to gr-ctrlport-monitor (i.e., "avg
> > work time"). For this attempt I integrated the templates where the
> > existing class is merely a wrapper (header and code body example below)
> > thereby keeping the flavor of the original class interfaces, not to
> > mention a certain amount of laziness on my part.
> > 
> > Integrated but not tested is float. At least for the gr_complex case, my
> > application shows squiggles on the QT GUI Sink.
> > 
> > Oh, this is NOT C++11.
> > 
> > 
> > 
> > Header:
> > 
> > namespace gr {
> >  namespace filter {
> > 
> >    class FILTER_API dc_blocker_cc_impl : public dc_blocker_cc {
> > 
> >    private:
> > 
> >      dc_blocker_t<gr_complex> d_the_real_me;
> > 
> >    public:
> > 
> >       dc_blocker_cc_impl(int D, bool long_form);
> >      ~dc_blocker_cc_impl();
> > 
> >      int group_delay();
> > 
> >      int work(int noutput_items,
> >               gr_vector_const_void_star &input_items,
> >               gr_vector_void_star &output_items);
> > 
> >    };
> > 
> >  } /* namespace filter */
> > } /* namespace gr */
> > 
> > 
> > Code body:
> > 
> >    dc_blocker_cc_impl::dc_blocker_cc_impl( int D, bool long_form )
> >      : sync_block( "dc_blocker_cc",
> >                    io_signature::make (1, 1, sizeof(gr_complex)),
> >                    io_signature::make (1, 1, sizeof(gr_complex))),
> >        d_the_real_me( D, long_form ) {
> > 
> >    }
> > 
> >    dc_blocker_cc_impl::~dc_blocker_cc_impl() {}
> > 
> >    int
> >    dc_blocker_cc_impl::group_delay() {
> > 
> >      return d_the_real_me.group_delay();
> >    }
> > 
> >    int
> >    dc_blocker_cc_impl::work( int                        noutput_items,
> >                              gr_vector_const_void_star& input_items,
> >                              gr_vector_void_star&       output_items) {
> > 
> >      return d_the_real_me.work( noutput_items, 
> >                                 input_items, 
> >                                 output_items );
> >    }
> > 
> > 
> > 
> > 
> > _______________________________________________
> > Discuss-gnuradio mailing list
> > address@hidden
> > https://lists.gnu.org/mailman/listinfo/discuss-gnuradio
> 




