discuss-gnuradio

Re: Maximum Number of Bins


From: Criss Swaim
Subject: Re: Maximum Number of Bins
Date: Mon, 2 Nov 2020 10:39:50 -0700
User-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101 Thunderbird/78.4.0

Thank you Marcus & Marcus - your insights are greatly appreciated.

I am looking at the suggestions, esp. the FFT conversion, and we are
considering upgrading, but we need to see whether the system will scale as is. 
BTW, I am maintaining the code and not the original developer, so I am
not familiar with all the pieces, esp. the ones that have been working. 
I look at this as an opportunity to dig deeper into GnuRadio.

1) For clarification, we have been running the 3.7 code for 4-5 years;
I am not sure when we upgraded to 3.7.9.  The system runs with 2 million FFT
bins, but at 3 or 4 million, it fails.  M. Leech has demonstrated that
without our custom block, GNU Radio can process the high bin levels.  I
have run various configurations of our model (without the bin_sub_avg
Python block), but I still receive the error.

2) Both of you have mentioned that we are using old message queues. Can you
point me to some documentation that explains this? We are using
blocks.add_const_vff and the connect functions to remove a background constant
(a numpy array) from the signal stream.  What would be a better
approach? I have not looked at this for several years, so I need to
refresh, and this would be a good time to look at alternate options.  It
is a bit of a black box for me and I would like to research alternate
approaches as I dig into this process.
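
For readers following along, the background removal described in 2) amounts to an element-wise addition of a negated background vector to each spectrum vector. The sketch below is plain Python rather than GNU Radio code; the helper name `add_const_vector` and all values are hypothetical and only illustrate the arithmetic.

```python
def add_const_vector(vector, const):
    """Element-wise vector + const, mimicking what an add-constant
    vector block does to each item that passes through it."""
    if len(vector) != len(const):
        raise ValueError("vector and constant must have the same length")
    return [v + c for v, c in zip(vector, const)]

# Hypothetical values: magnitudes for 4 FFT bins and their measured background.
spectrum = [5.0, 7.5, 6.0, 9.0]
background = [4.0, 7.0, 6.0, 2.0]

# Subtracting the background is the same as adding its negation.
negated = [-b for b in background]
cleaned = add_const_vector(spectrum, negated)
```

The constant must have exactly one entry per bin, so its length has to change whenever the FFT size changes.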

3) M. Leech: you indicated that the conversion from
stream->string->numpy array is very inefficient.  Can you point to
another approach to convert a stream to a numpy array?  This is done once
every 60 minutes, but still, if it could be improved, that would help.
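
One possible direction for 3), sketched here with made-up data, is `numpy.frombuffer`, which reinterprets a raw byte buffer as a typed array without copying it. The payload below is a hypothetical stand-in for bytes pulled from a message queue; the real item type and vector length in the flow graph may differ. Note also that inside a GNU Radio Python block, `work()` already receives its inputs as NumPy arrays, so no conversion is needed there.

```python
import struct
import numpy as np

# Hypothetical stand-in for the raw bytes of one float32 vector taken
# from a message queue (4 bins here; the real vectors are far larger).
payload = struct.pack("<4f", 1.0, 2.5, -3.0, 0.25)

# Reinterpret the bytes as little-endian float32 without copying.
# The result is a read-only view of `payload`.
bins = np.frombuffer(payload, dtype="<f4")

# Take a writable copy only if the data must be modified in place.
background = bins.copy()
background *= 0.5
```

Because the view shares memory with the byte string, the only full-size copy is the explicit `copy()`, and that can be skipped when read-only access is enough.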

4) Finally, I have also been looking for a change log for the 3.7 to 3.8
system. Moving from 3.6 to 3.7 was a significant change, and I was
wondering whether 3.7 to 3.8 is the same level of effort for custom blocks. 
Also, is there a timeline for 3.9?

Again, thanks for any guidance.

Criss Swaim
cswaim@tpginc.net
cell: 505.301.5701

On 10/31/2020 9:55 AM, Marcus Müller wrote:
> Hi Craig, hi Marcus,
>
> Also, just because I need to point that out:
>
> GNU Radio 3.7 is really a legacy series of releases by now. You should
> avoid using it for new developments - it's getting harder and harder to
> even build it on modern versions of Linux. In fact, a lot of its
> dependencies simply don't exist for modern systems anymore.
> Developing for 3.7 is hence dangerous in terms of lifetime. That's among
> the chief reasons why we released 3.8. Took us long enough!
>
> 3.7.9.2 is positively ancient. A 3.7.13.4 or later should be the oldest
> version of GNU Radio you work with, even when maintaining old code.
>
> Other than that:
>
>> Oct 29 10:45:07 tf kernel: analysis_sink_1[369]: segfault at
>> 7f9c5a7fd000 ip 00007f9dd9361d43 sp 00007f9c5a48a638 error 6 in
>> libgnuradio-vandevender.so[7f9dd9336000+4d000]
> This really looks like a bug in your code!
> These happen easily with the older style msgq that you seem to be using
> (we've basically all but removed these in current development versions
> of GNU Radio), especially if directly interfacing with Python land,
> which has different ideas of object lifetime than your C++ code might
> have...
> I think a slight reconsideration of your software architecture might
> help here, but I've not seen your overall code.
>
>> With an FFT size of 2**22 bins.  This took about 20 seconds for the
>> FFTW3 "planner" to crunch on, but after that, worked
>>   just fine within the flow-graph.
>>
> Not quite 20s for me, but yes, single-threaded FFT performance was about
> 14 transforms of that size per second, 2 threads allowed for ~23
> transforms a second, 4 threads for about 28. Knowing GNU Radio, I'd
> recommend you rather stick with a single thread per transform, because
> other blocks also have CPU requirements (if you really want to increase
> throughput, deinterleave vectors and have multiple single-threaded FFTs
> run in parallel, then recombine after).
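
The deinterleave, transform, recombine pattern suggested above can be sketched in plain Python. This is only a structural illustration, not GNU Radio code: `transform` is a hypothetical placeholder for a single-threaded FFT, the vectors are tiny stand-ins, and Python threads are used just to show the shape of the pipeline.

```python
from concurrent.futures import ThreadPoolExecutor

def deinterleave(vectors, lanes):
    """Round-robin split: vector i goes to lane i % lanes."""
    return [vectors[k::lanes] for k in range(lanes)]

def interleave(lane_outputs):
    """Inverse of deinterleave: restore the original vector order."""
    lanes = len(lane_outputs)
    total = sum(len(lane) for lane in lane_outputs)
    return [lane_outputs[i % lanes][i // lanes] for i in range(total)]

def transform(vector):
    # Hypothetical placeholder for a single-threaded FFT of one vector.
    return [2 * x for x in vector]

def process_lane(lane):
    return [transform(v) for v in lane]

vectors = [[i, i + 1] for i in range(7)]   # 7 small stand-in vectors
lanes = deinterleave(vectors, 3)
with ThreadPoolExecutor(max_workers=3) as pool:
    outputs = list(pool.map(process_lane, lanes))
result = interleave(outputs)
```

Each lane sees every third vector, so the recombined output keeps the original order even when the number of vectors is not a multiple of the lane count.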
>
> Seeing that you only need 20 MS/s, and 14 transforms per second are
> 14 · 2²² = 7 · 2²³ samples a second, which is roughly 59 MS/s, I think
> you are fine. If you're not, get a faster PC, honestly!
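
The throughput estimate above can be checked with a few lines of arithmetic. The figures are the ones quoted in this message (14 single-threaded transforms per second, 2**22 bins, 20 MS/s required); the variable names are just for illustration.

```python
fft_size = 2 ** 22            # bins per transform
transforms_per_second = 14    # single-threaded rate quoted above
required_rate = 20e6          # samples per second needed

# Samples consumed per second if every transform takes fft_size new samples.
achievable_rate = transforms_per_second * fft_size
headroom = achievable_rate / required_rate

print(f"achievable: {achievable_rate / 1e6:.1f} MS/s")   # achievable: 58.7 MS/s
print(f"headroom:   {headroom:.1f}x")                    # headroom:   2.9x
```

This covers the FFT alone; the other blocks in the flow graph need CPU time as well.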
>
> Best regards,
> Marcus M (the younger Marcus)
>
> On 29.10.20 23:53, Marcus D. Leech wrote:
>> On 10/29/2020 06:03 PM, Criss Swaim wrote:
>>> we are running version 3.7.9.2
>>>
>> I constructed a simple flow-graph in GR 3.7.13.5
>>
>> osmosdr_source--->stream-to-vector-->fft-->null-sink
>>
>> With an FFT size of 2**22 bins.  This took about 20 seconds for the
>> FFTW3 "planner" to crunch on, but after that, worked
>>   just fine within the flow-graph.
>>
>> You should really keep your FFT sizes to a power-of-2, particularly at
>> this size range.  That's not related to your problem
>>   directly, but power-of-2 FFTs have lower computational complexity. 
>> Among other things, the FFTW3 "planner" for
>>   non-power-of-2 FFTs at these eye-watering FFT sizes seems to take a
>> LONG time to compute a "plan".
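
As a small illustration of the power-of-2 advice, the sketch below rounds a requested bin count up to the next power of two. The helper names and the 3,000,000 figure (taken from the "3 million bins" case in this thread) are illustrative only.

```python
def next_power_of_two(n):
    """Smallest power of two that is >= n (for n >= 1)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return 1 << (n - 1).bit_length()

def is_power_of_two(n):
    """True if n is exactly 2**k for some k >= 0."""
    return n >= 1 and (n & (n - 1)) == 0

requested = 3000000
fft_size = next_power_of_two(requested)   # 4194304, i.e. 2**22
```

Rounding up changes the bin width for a given sample rate, so anything downstream that maps bins to frequencies has to use the new size.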
>>
>> You should probably look at restructuring your code--looks like you're
>> using message queues and marshaling your samples
>>   as *string* data through those queues.  While it shouldn't necessarily
>> Seg Fault, it's not a terribly efficient way of doing things.
>>
>>
>>> Criss Swaim
>>> cswaim@tpginc.net
>>> cell: 505.301.5701
>>> On 10/29/2020 11:37 AM, Marcus D. Leech wrote:
>>>> On 10/29/2020 01:17 PM, Criss Swaim wrote:
>>>>> I have attached a png of the flow graph and the error msgs from the
>>>>> system log are below.  These error messages are the only messages.
>>>>>
>>>>>> Oct 29 10:45:26 tf abrt-hook-ccpp[378]: /var/spool/abrt is
>>>>>> 23611049718 bytes (more than 1279MiB), deleting
>>>>>> 'ccpp-2020-10-27-15:30:43-28474'
>>>>>> Oct 29 10:45:07 tf abrt-hook-ccpp[378]: Process 329 (python2.7) of
>>>>>> user 1000 killed by SIGSEGV - dumping core
>>>>>> Oct 29 10:45:07 tf audit[370]: ANOM_ABEND auid=1000 uid=1000
>>>>>> gid=1000 ses=8656
>>>>>> subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 pid=370
>>>>>> comm="copy11" exe="/usr/bin/
>>>>>> Oct 29 10:45:07 tf audit[369]: ANOM_ABEND auid=1000 uid=1000
>>>>>> gid=1000 ses=8656
>>>>>> subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 pid=369
>>>>>> comm="analysis_sink_1" exe="
>>>>>> Oct 29 10:45:07 tf kernel: traps: copy11[370] general protection
>>>>>> ip:7f9e0acfdee0 sp:7f9c5a7fb590 error:0 in
>>>>>> libpthread-2.22.so[7f9e0acf1000+18000]
>>>>>> Oct 29 10:45:07 tf kernel: analysis_sink_1[369]: segfault at
>>>>>> 7f9c5a7fd000 ip 00007f9dd9361d43 sp 00007f9c5a48a638 error 6 in
>>>>>> libgnuradio-vandevender.so[7f9dd9336000+4d000]
>>>>> Flow is USRP -> stream to vector -> fft -> complex to mag ->
>>>>> bin_sub_avg -> analysis_sinkf
>>>>>
>>>>> bin_sub_avg (python) & analysis_sinkf (c/c++) are custom blocks.
>>>>>
>>>>> The function of bin_sub_avg, which is written in Python, is to start
>>>>> a background task which periodically (in this case hourly) samples
>>>>> the input signal, calculates the background noise and subtracts it
>>>>> from the signal that is passed to the analysis_sinkf module.
>>>>>
>>>>> analysis_sinkf monitors each bin and only when specific thresholds for
>>>>> the bin are met (i.e., duration, strength) is the signal written out to
>>>>> a signal file.  Signals not passing the criteria are dropped.
>>>>>
>>>>> This code base has been running for over 3 years, with the original
>>>>> system implementation about 8/9 years ago.
>>>>>
>>>>> I have traced the problem to the input signal into bin_sub_avg when
>>>>> the number of fft bins is 3 million (2 million works).  At 3 million
>>>>> bins, any reference to the result of the delete_head() function in
>>>>> the python code causes a failure.  The python code just fails
>>>>> without a traceback, then the invalid data stream is passed to the
>>>>> analysis_sinkf module which is C/C++ and it causes the segmentation fault.
>>>>>
>>>>> Thus my suspicion is there is a limit in the fft block on the number
>>>>> of bins it can handle and some variable is overflowing, but this is
>>>>> a guess at this point.  There may be a restriction in the
>>>>> gr.signature_io module, but that seems unlikely.
>>>>>
>>>>>
>>>> What version of Gnu Radio is this?
>>>>
>>>>


