Re: [Discuss-gnuradio] Using volk in Mac: test report

From: Tom Rondeau
Subject: Re: [Discuss-gnuradio] Using volk in Mac: test report
Date: Mon, 27 Feb 2012 11:01:34 -0500

On Thu, Feb 23, 2012 at 9:08 PM, Nowlan, Sean <address@hidden> wrote:

Hi Tom,


I tested with your merged branch. No segfault and same tests fail as expected.


I noticed several weird numbers in the orc results. Some of them correspond to the failed cases. Do you know what is causing these? Printf formatting issue? Hitting bounds of float type and wrapping around? Relevant output:

So these test results are really interesting. The timing results are one thing, but the numerical results you pointed out are hopefully easy to fix. In the volk_32fc_s32f_magnitude_16i_a kernel, it could be an issue with rounding. At first, I thought it was the direction of the rounding operation, but the documents from ARM say that the only mode is round-to-nearest and that other modes are disabled (which is what we've set the SSE to, as well). Perhaps instead of truncating when converting to 16-bit shorts, it rounds first.
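To illustrate the rounding hypothesis, here is a minimal Python sketch (not Volk or Orc code) showing how truncation and round-to-nearest can disagree by one count when a float is converted to a 16-bit integer. The function names and the sample inputs are made up for illustration; they are only chosen so that the outputs resemble the off-by-one pairs in the log.

```python
INT16_MIN, INT16_MAX = -32768, 32767

def to_int16_truncate(x: float) -> int:
    """Drop the fractional part (like a C cast), then saturate to int16."""
    return max(INT16_MIN, min(INT16_MAX, int(x)))

def to_int16_round(x: float) -> int:
    """Round to the nearest integer (ties to even), then saturate to int16."""
    return max(INT16_MIN, min(INT16_MAX, round(x)))

# Hypothetical scaled magnitudes, not values taken from the real test run.
samples = [29281.7, -27600.6, 100.2]

for x in samples:
    t = to_int16_truncate(x)
    r = to_int16_round(x)
    status = "differs" if t != r else "same"
    print(f"{x}: truncate={t} round={r} ({status})")
```

If one implementation truncates and the other rounds, only inputs whose fractional part is at least one half would differ, which would be consistent with a handful of mismatches scattered over thousands of samples.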

In the volk_32fc_x2_multiply_32fc_a kernel, all of the numbers look correct. This is probably just a precision thing: we're asking for the numbers to match exactly to way too many decimal places.
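As a rough illustration of the precision point, the Python sketch below (not the Volk QA code) compares two result lists with a relative tolerance. The helper name, the tolerances, and the size of the simulated error are all assumptions chosen for the example. The two lists print identically at six significant digits, as in the log, yet fail a comparison that is too strict.

```python
import math

def count_mismatches(ref, test, rel_tol):
    """Count element pairs that differ by more than the relative tolerance."""
    return sum(
        1
        for r, t in zip(ref, test)
        if not math.isclose(r, t, rel_tol=rel_tol, abs_tol=1e-12)
    )

# Reference values as printed in the log (six significant digits).
ref = [0.382086, 0.496706, 0.170967]
# Hypothetical second implementation with a tiny relative error.
test = [r * (1 + 3e-7) for r in ref]

# Both lists look the same when printed with default %g formatting.
same_when_printed = ["%g" % v for v in ref] == ["%g" % v for v in test]
print("identical when printed:", same_when_printed)

# A very tight tolerance flags every element; a looser one accepts them.
print("mismatches at rel_tol=1e-9:", count_mismatches(ref, test, 1e-9))
print("mismatches at rel_tol=1e-5:", count_mismatches(ref, test, 1e-5))
```

What tolerance is appropriate for single-precision complex multiplication is a judgment call; the numbers here only show the effect of the threshold, not a recommended value.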

Hopefully I'll get access to an E100 soon to look into this more.



RUN_VOLK_TESTS: volk_16ic_deinterleave_16i_x2_a

generic completed in 42.47s

orc completed in 3.10883e-39s

Best arch: orc


RUN_VOLK_TESTS: volk_16ic_s32f_deinterleave_32f_x2_a

generic completed in 17.28s

orc completed in -3.90577e+11s

Best arch: orc


RUN_VOLK_TESTS: volk_32fc_s32f_magnitude_16i_a

generic completed in 4.37s

orc completed in 1.35136e+09s

offset 1107 in1: 29281 in2: 29282

offset 1187 in1: -27601 in2: -27600

offset 1522 in1: -31248 in2: -31249

offset 2396 in1: 26146 in2: 26145

offset 2486 in1: 25394 in2: 25393

offset 4084 in1: 16452 in2: 16451

offset 5052 in1: 28692 in2: 28691

offset 5296 in1: 30869 in2: 30868

offset 5467 in1: -32706 in2: -32705

offset 6388 in1: 19556 in2: 19557

volk_32fc_s32f_magnitude_16i_a: fail on arch orc

Best arch: generic


RUN_VOLK_TESTS: volk_32fc_magnitude_32f_a

generic completed in 35.24s

orc completed in 6.93125e+10s

Best arch: generic


RUN_VOLK_TESTS: volk_32fc_x2_multiply_32fc_a

generic completed in 52.66s

orc completed in -3.4978e+12s

offset 3 in1: 0.382086 in2: 0.382086

offset 4 in1: 0.496706 in2: 0.496706

offset 8 in1: 0.170967 in2: 0.170967

offset 10 in1: 0.165878 in2: 0.165878

offset 14 in1: 0.398192 in2: 0.398192

offset 15 in1: 0.492358 in2: 0.492358

offset 17 in1: 0.568251 in2: 0.568251

offset 19 in1: 0.0630723 in2: 0.0630723

offset 20 in1: 0.251459 in2: 0.251459

offset 22 in1: 0.348539 in2: 0.348539

volk_32fc_x2_multiply_32fc_a: fail on arch orc

offset 0 in1: 0.140486 in2: 0.140486

offset 1 in1: 0.691375 in2: 0.691375

offset 5 in1: 0.63745 in2: 0.63745

offset 11 in1: 0.644697 in2: 0.644697

offset 14 in1: 0.858205 in2: 0.858205

offset 15 in1: 0.94011 in2: 0.94011

offset 16 in1: 0.490713 in2: 0.490713

offset 18 in1: 0.190573 in2: 0.190573

offset 19 in1: 0.0226408 in2: 0.0226408

offset 20 in1: 0.895774 in2: 0.895774

volk_32fc_x2_multiply_32fc_a: fail on arch orc

offset 1 in1: 0.524585 in2: 0.524585

offset 2 in1: 0.236218 in2: 0.236218

offset 6 in1: 0.733853 in2: 0.733853

offset 9 in1: 0.290247 in2: 0.290247

offset 11 in1: 0.529422 in2: 0.529422

offset 12 in1: 0.180218 in2: 0.180218

offset 14 in1: 0.496568 in2: 0.496568

offset 15 in1: 0.0297472 in2: 0.0297472

offset 19 in1: 0.351138 in2: 0.351138

offset 20 in1: 0.300737 in2: 0.300737

volk_32fc_x2_multiply_32fc_a: fail on arch orc

Best arch: generic




From: address@hidden [mailto:address@hidden] On Behalf Of Tom Rondeau
Sent: Thursday, February 23, 2012 11:18 AM

To: Nowlan, Sean
Cc: Nick Foster; address@hidden
Subject: Re: [Discuss-gnuradio] Using volk in Mac: test report


On Wed, Feb 22, 2012 at 10:19 PM, Nowlan, Sean <address@hidden> wrote:

I confirmed this works on E100 insofar as I no longer get a segfault on volk_32fc_s32fc_multiply_32fc_a. But volk_32fc_s32f_magnitude_16i_a and volk_32fc_x2_multiply_32fc_a still fail as expected.






I just merged Nick's branch into my safe_align branch. Can you check that one out and test when you get a chance? I just want to make sure we're all on the same branch here. And please post the output of 'ctest -V -R volk'.







From: address@hidden [mailto:address@hidden] On Behalf Of Tom Rondeau
Sent: Tuesday, February 21, 2012 6:49 PM
To: Nick Foster
Cc: Nowlan, Sean; address@hidden

Subject: Re: [Discuss-gnuradio] Using volk in Mac: test report


On Tue, Feb 21, 2012 at 6:43 PM, Nick Foster <address@hidden> wrote:

Tom, Sean,


There's a couple of things here. First, the Orc volk_32fc_s32f_magnitude_16i_a function is rounding differently than the generic versions on E100 for some reason. Not fatal, totally usable, but it makes the QA code fail. Second, the volk_32fc_x2_multiply_32fc_a looks like it's working fine but the thresholds are too close in the comparison function, which is strange because it uses the same threshold I use everywhere else. I'll keep looking into that. In any case, they're fine for use in Volk as-is.


I think the segfault in volk_32fc_s32fc_multiply_32fc_a is being caused by a bug in the profiler code as well. It's not correctly handling complex scalars. The function itself doesn't actually work either, which doesn't help, but it wasn't caught because the profiler code was buggy...


Tom, I pushed a fix to my github under "volk_fix". For now I've disabled volk_32fc_s32fc_multiply_32fc_a since I can't figure out a clean way to get it to work under Orc; I had a misunderstanding of how float parameters are handled inside array operations. I also added complex scalar handling. I'll keep looking into solving this one for real but this will get things working for now.





Thanks a ton for working on this. I'll merge your branch asap.





On Sat, Feb 18, 2012 at 1:05 PM, Tom Rondeau <address@hidden> wrote:

On Fri, Feb 17, 2012 at 6:04 PM, Nowlan, Sean <address@hidden> wrote:

Don’t know how helpful these are, but here you go.





It looks like a couple of functions are failing from the stdout:


volk_32fc_s32f_magnitude_16i_a: fail on arch orc

volk_32fc_x2_multiply_32fc_a: fail on arch orc


These are both the Orc implementations of the functions, which seem to work fine on my Intel processors. I don't have access to an OSX box or an E100, so I can't really test this out. The files you sent me don't (appear to) tell me what the real problem is.


We'll need some other brave soul out there who can dig into these issues on the platforms for us.







From: address@hidden [mailto:address@hidden] On Behalf Of Tom Rondeau
Sent: Friday, February 17, 2012 5:25 PM
To: Nowlan, Sean
Cc: Nick Foster; address@hidden

Subject: Re: [Discuss-gnuradio] Using volk in Mac: test report


On Fri, Feb 17, 2012 at 5:11 PM, Nowlan, Sean <address@hidden> wrote:

I built Tom’s safe_align branch on E100 and ran volk_profile. It segfaulted on “RUN_VOLK_TESTS: volk_32fc_s32fc_multiply_32fc_a”. I’ll get a stack trace for you.




Really interesting that it's the same block. Hopefully, it's a single, simple fix. I'll look into it when you can get me the stack trace.


Thanks for reporting!





From: discuss-gnuradio-bounces+sean.nowlan=address@hidden [mailto:discuss-gnuradio-bounces+sean.nowlan=address@hidden] On Behalf Of Tom Rondeau
Sent: Friday, February 17, 2012 2:33 PM
To: Nick Foster
Cc: address@hidden
Subject: Re: [Discuss-gnuradio] Using volk in Mac: test report


On Fri, Feb 17, 2012 at 2:30 PM, Nick Foster <address@hidden> wrote:

On Fri, Feb 17, 2012 at 11:20 AM, Carles Fernandez <address@hidden> wrote:

Thanks for the inputs!

We are interested in determining the best architecture at instantiation
time. What would be the best strategy? We thought about running the
same operations several times for each architecture, measuring the
results, and using the fastest one for the processing blocks. Would this
be the right approach?




Run volk_profile. It does exactly what you said, and writes the results to ~/.volk/volk_config. Volk reads this file when it is involked (sorry) to determine which particular function to execute. So all you do is run volk_profile once on any given machine, and it's optimized.
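The idea of timing each implementation and keeping the fastest can be sketched in a few lines of Python. This is only an illustration of the approach, not how volk_profile is actually written: the two kernel functions, the `profile` helper, and the iteration count are hypothetical, and the sketch returns a dictionary rather than writing a real volk_config file.

```python
import time

def multiply_generic(a, b):
    """Reference implementation: explicit loop."""
    out = []
    for x, y in zip(a, b):
        out.append(x * y)
    return out

def multiply_comprehension(a, b):
    """Alternative implementation of the same operation."""
    return [x * y for x, y in zip(a, b)]

def profile(impls, args, iterations=20):
    """Time every implementation on the same input and return the fastest.

    Returns (best_name, timings), where timings maps name -> total seconds.
    """
    timings = {}
    for name, fn in impls.items():
        start = time.perf_counter()
        for _ in range(iterations):
            fn(*args)
        timings[name] = time.perf_counter() - start
    best = min(timings, key=timings.get)
    return best, timings

a = [complex(i, -i) for i in range(2000)]
b = [complex(0.5, 0.25)] * 2000
impls = {"generic": multiply_generic, "comprehension": multiply_comprehension}

best, timings = profile(impls, (a, b))
for name, seconds in timings.items():
    print(f"{name} completed in {seconds:.6f}s")
print("Best arch:", best)
```

A real profiler would also check that every implementation produces the same output before trusting its timing, which is exactly where the Orc failures in this thread show up.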





This is discussed on the webpage:


We'll be updating this as things progress with Volk, but the profiler info is there already.








