Re: [Discuss-gnuradio] QT GUI time sink (float) unnecessary memmove()?

From: Tom Rondeau
Subject: Re: [Discuss-gnuradio] QT GUI time sink (float) unnecessary memmove()?
Date: Sat, 28 Mar 2015 11:12:15 -0700

On Sat, Mar 28, 2015 at 11:00 AM, Andy Walls <address@hidden> wrote:

Can this memmove() be safely skipped


if ((d_start == 0) || (gr::high_res_timer_now() - d_last_time > d_update_time))?

I think it can, but I am not sure.

With some high throughput, high sample rate flowgraphs, ps shows my
time_sink_f thread taking up ~41% of a CPU and oprofile shows memmove()
as the number 2 pig on the list:

CPU: Intel Sandy Bridge microarchitecture, speed 3.5e+06 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 90000
samples  %        image name               symbol name
24498    33.9825  libvolk.so.0.0.0         volk_32f_convert_64f_u_avx
14278    19.8058  libc-2.18.so             __memmove_ssse3_back
7488     10.3870  no-vmlinux               /no-vmlinux
2668      3.7009  libgnuradio-qtgui-3.7.7git.so.0.0.0 gr::qtgui::time_sink_f_impl::_test_trigger_slope(float const*) const
2073      2.8756  libpthread-2.18.so       pthread_mutex_lock

The volk_32f_convert_64f_u_avx() call is unavoidable as Qwt wants
doubles for plotting and not floats. But it might also be able to be
deferred to the very end when the decision to plot is known for sure.
(But that's more surgery than I care to take on at the moment.)


The for loop there handles the case where we're triggering with a delay set, which puts d_start at a nonzero offset into the buffers. But we pass the vector of buffers to the plotting widget, which always starts at index 0, so the data has to be moved down to the front first.

There are a couple of things that could work here. We could add an argument to TimeUpdateEvent that carries the index value. We could also do the move only if d_start > 0.

But thinking about the volk convert function: it both copies the data from the input buffer into the internal buffer and performs the conversion. We can't just hold data in the input, since we don't want to back up the data until we're ready to plot, both with timing and with a full enough buffer. The sink just samples a section at a time and drops everything in between. That part could be converted into a memcpy instead of the volk convert. Then, when we're ready to plot, we call the volk convert, which also does the move from d_start to 0, so it combines those two steps.
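To make the last proposal concrete, here is a rough sketch of the idea, not the actual time_sink_f_impl code. The struct staged_sink and its members are hypothetical names made up for illustration, and a plain loop stands in for the volk convert kernel so the example has no dependencies. Raw floats are staged with memcpy on every call, and the float-to-double conversion happens once at plot time, reading from the d_start offset and writing at index 0.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Stand-in for the volk float-to-double convert kernel (assumption: a plain
// loop is used here only to keep the sketch self-contained).
static void convert_float_to_double(double* out, const float* in, std::size_t n)
{
    for (std::size_t i = 0; i < n; ++i)
        out[i] = static_cast<double>(in[i]);
}

// Hypothetical holder for one input stream; not the real sink implementation.
struct staged_sink {
    std::vector<float> staging; // raw samples, room for plot size plus delay
    std::vector<double> plot;   // buffer the plot widget reads from index 0
    std::size_t filled = 0;     // number of valid samples in staging

    staged_sink(std::size_t size, std::size_t max_delay)
        : staging(size + max_delay), plot(size)
    {
    }

    // Called for every chunk of input: a cheap copy, no conversion yet.
    // Returns how many samples were actually taken.
    std::size_t stage(const float* in, std::size_t n)
    {
        const std::size_t room = staging.size() - filled;
        const std::size_t take = n < room ? n : room;
        std::memcpy(staging.data() + filled, in, take * sizeof(float));
        filled += take;
        return take;
    }

    // Called only once the decision to plot has been made. The conversion
    // reads from offset `start` and writes to index 0, so no separate
    // memmove() is needed. Returns false if too few samples are staged.
    bool flush(std::size_t start)
    {
        if (start + plot.size() > filled)
            return false;
        convert_float_to_double(plot.data(), staging.data() + start, plot.size());
        filled = 0; // drop everything and start sampling a new section
        return true;
    }
};
```

With this layout the per-call cost is a float memcpy, and the more expensive conversion only runs for buffers that actually get plotted.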

Thoughts on those proposals?

