From: Marcus Müller
Subject: Re: [Discuss-gnuradio] the block "complex to Arg" in gnuradio
Date: Tue, 10 Nov 2015 23:35:09 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.2.0
Hi Johannes, Hi xd,

complex_to_arg uses GNU Radio's fast_atan2f function, which is an approximation [1]. Between the 255 values of the lookup table, it uses linear interpolation, hence your 0.4 error factor. As Johannes said, that's not really surprising for a lookup-table-based approach.

I do think using this approximation is justified, but I also think that the code it uses has been obsolete for a while now: gr::fast_atan2 could be replaced by VOLK's volk_32fc_s32f_atan2_32f, which has been around since 2012 but hasn't seen any use in GNU Radio, as far as I can tell.

Now, I went ahead and ran a benchmark [2], which showed that gr::fast_atan2 is actually quite fast, but only about twice as fast as the standard, been-around-forever libc implementation and the VOLK implementation. (Admittedly, the VOLK kernel also does a multiplication with 1.0; and by the way, the generic VOLK kernel, which does libc atan2 plus a multiplication, is exactly as fast as the SSE4 one on my machine.) Everything except fast_atan2 is pretty much in the same range as C++ <complex>'s std::arg.

For 2²⁵ complex numbers, of which at least half have small angles:

    .fast:            0.397261s wall, 0.370000s user + 0.020000s system = 0.390000s CPU (98.2%)
    .volk:            0.780515s wall, 0.760000s user + 0.020000s system = 0.780000s CPU (99.9%)
    .libc:            0.777738s wall, 0.760000s user + 0.020000s system = 0.780000s CPU (100.3%)
    .c++ complex arg: 0.815700s wall, 0.780000s user + 0.030000s system = 0.810000s CPU (99.3%)

But: this is on an Intel i7. Things might look different on your average Android phone or, even worse, your Raspberry Pi (so if you want to test, see [2]).

Conclusion: if you care about accuracy at small angles, the factor-2 speedup of the current complex_to_arg might not be what you're after. That is probably not the case if you use complex_to_arg in a quadrature_demod inside an FM audio receiver running on an embedded device; small angle errors don't make the least difference there.
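To make the "lookup table plus linear interpolation" idea concrete, here is a minimal sketch of how such an atan2 approximation can be built. This is not GNU Radio's fast_atan2f code and does not reproduce its accuracy characteristics; the names lut_atan2f, atan_table and kTableSize, the table size of 256 and the octant-folding scheme are all assumptions made for illustration.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

// Hypothetical sketch, NOT GNU Radio's fast_atan2f implementation.
constexpr std::size_t kTableSize = 256; // assumed table size
constexpr float kPi = 3.14159265358979323846f;

// Table of atan(t) sampled at kTableSize evenly spaced points in [0, 1].
inline const std::array<float, kTableSize>& atan_table()
{
    static const std::array<float, kTableSize> table = [] {
        std::array<float, kTableSize> t{};
        for (std::size_t i = 0; i < kTableSize; ++i)
            t[i] = std::atan(static_cast<float>(i) / (kTableSize - 1));
        return t;
    }();
    return table;
}

// Approximate atan2(y, x) for finite inputs.
inline float lut_atan2f(float y, float x)
{
    const float ax = std::fabs(x);
    const float ay = std::fabs(y);
    if (ax == 0.0f && ay == 0.0f)
        return 0.0f; // convention chosen for this sketch

    // Fold into the first octant so that the ratio lies in [0, 1].
    const bool swapped = ay > ax;
    const float ratio = swapped ? ax / ay : ay / ax;

    // Locate the two neighbouring table entries and interpolate linearly.
    const float pos = ratio * (kTableSize - 1);
    std::size_t idx = static_cast<std::size_t>(pos);
    if (idx >= kTableSize - 1)
        idx = kTableSize - 2; // keep idx + 1 inside the table
    const float frac = pos - static_cast<float>(idx);
    const auto& t = atan_table();
    float angle = t[idx] + frac * (t[idx + 1] - t[idx]);

    // Undo the folding: octant first, then the sign of x, then the sign of y.
    if (swapped)
        angle = kPi / 2 - angle;
    if (x < 0.0f)
        angle = kPi - angle;
    if (y < 0.0f)
        angle = -angle;
    return angle;
}
```

Comparing lut_atan2f(y, x) against std::atan2(y, x) over your own input distribution is a quick way to see where the interpolation error of a scheme like this ends up.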
The question is, as it was with gr::random, whether we still prefer performance over precision, or whether we opt for exactness. Also, I was pretty amazed at how fast fast_atan2 really is: its dependence on branching suggests it's pretty hard for a compiler to vectorize and optimize.

Best regards,
Marcus

[1] https://gnuradio.org/doc/doxygen/group__misc.html#ga6c1470346a3524989b7a8a3639aa79a7
[2]

On 10.11.2015 10:45, Johannes Demel wrote:
> Hi,