
Re: [Discuss-gnuradio] Using volk in Mac: test report


From: Tom Rondeau
Subject: Re: [Discuss-gnuradio] Using volk in Mac: test report
Date: Thu, 1 Mar 2012 18:37:31 -0500

On Thu, Mar 1, 2012 at 5:52 PM, Tom Rondeau <address@hidden> wrote:
On Mon, Feb 27, 2012 at 11:01 AM, Tom Rondeau <address@hidden> wrote:
On Thu, Feb 23, 2012 at 9:08 PM, Nowlan, Sean <address@hidden> wrote:

Hi Tom,

 

I tested with your merged branch. No segfault and same tests fail as expected.

 

I noticed several weird numbers in the orc results. Some of them correspond to the failed cases. Do you know what is causing these? Printf formatting issue? Hitting bounds of float type and wrapping around? Relevant output:


So these test results are really interesting. The timing results are one thing, but the numerical results you pointed out are hopefully easy to fix. In the volk_32fc_s32f_magnitude_16i_a block, it could be an issue with rounding. At first, I thought it was the direction of the rounding operation, but the documents from ARM say that the only mode is round-to-nearest and that the other modes are disabled (which is what we've set the SSE to, as well). Perhaps instead of truncating when converting to 16-bit shorts, it rounds first.
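To illustrate the rounding hypothesis above, here is a minimal sketch in Python (not Volk code) of how truncating and rounding to nearest can give results that differ by one when a scaled float is converted to an integer. The function names and sample values are made up for illustration; they are not taken from the test log.

```python
# Hypothetical helpers, not part of Volk.

def convert_truncate(x):
    # int() discards the fractional part (rounds toward zero),
    # like a plain float-to-short cast in C.
    return int(x)

def convert_nearest(x):
    # round() rounds to the nearest integer, ties to even,
    # like the default floating-point rounding mode.
    return round(x)

samples = [100.2, 100.7, -100.7, 101.5]  # illustrative values only

for value in samples:
    t = convert_truncate(value)
    n = convert_nearest(value)
    print(f"{value:8.1f}  truncate: {t:5d}  nearest: {n:5d}  diff: {n - t:+d}")
```

If one implementation truncates and the other rounds, a test that demands exact equality of the 16-bit outputs would report off-by-one mismatches of this kind.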

The 32fc_x2_multiply_32fc_a kernel looks like all of the numbers are correct. This is probably just a precision thing and we're asking for the numbers to be exact for way too many decimal places.
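As a small illustration of the precision point, the Python sketch below shows two values that print identically at six significant digits yet are not equal, and how a relative tolerance decides whether they count as a match. The values, the six-digit format, and the within_tolerance helper are assumptions for illustration; this is not the comparison function used by the Volk QA code, and it does not establish the cause of the failure reported here.

```python
def within_tolerance(x, y, rel_tol):
    # Hypothetical relative-tolerance check, not the Volk test routine.
    return abs(x - y) <= rel_tol * abs(x)

a = 0.382086
b = 0.38208612  # differs from a in the 7th significant digit

print("%g %g" % (a, b))              # both print as 0.382086
print(a == b)                        # False
print(within_tolerance(a, b, 1e-6))  # True
print(within_tolerance(a, b, 1e-8))  # False
```

So a log that shows matching numbers does not by itself prove that the underlying values are equal.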

Hopefully I'll get access to an E100 soon to look into this more.

Tom

Well, I can't seem to make anything out of those errors. They just don't make sense. Especially volk_32fc_x2_multiply_32fc_a, which gives what look like the exact same results, but the tests fail. This even happens when I turn the tolerance way up.

When I run volk_profile, the same tests fail in the same way. Also, the timing numbers still don't make sense, even when running much longer tests.

Now, the good news is that after running volk_profile, those failed tests are never selected for use. So when using Volk, these errors will never affect the system at runtime. It's too bad, since we won't benefit from the speedup of Orc, but I don't think the current blocks in GNU Radio make use of these particular blocks anyway.
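The selection behavior described above can be sketched roughly as follows. This is a Python illustration under assumptions, not the volk_profile source: the results dictionary and the pick_best_arch function are hypothetical, and the two timings are the figures reported for volk_32fc_s32f_magnitude_16i_a in Sean's output.

```python
def pick_best_arch(results):
    # results maps an architecture name to (passed, seconds).
    # Implementations that failed their test are never candidates.
    candidates = {arch: secs for arch, (passed, secs) in results.items() if passed}
    if not candidates:
        raise ValueError("no implementation passed its test")
    # Among the passing implementations, the fastest one wins.
    return min(candidates, key=candidates.get)

results = {
    "generic": (True, 4.37),
    "orc": (False, 1.35136e+09),  # failed the comparison against generic
}

print(pick_best_arch(results))  # generic
```

Filtering on the test result before comparing times is what keeps a failing implementation out of use even if its reported time looks better.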

I was also getting a new error running make test in gr-core-test-all. Is anyone else seeing this?

Tom

I just recompiled the gr-core-test-all and the error went away. I'm not entirely happy about that, but until I can recreate it, I think it's time to declare success and walk away...

Tom


 

RUN_VOLK_TESTS: volk_16ic_deinterleave_16i_x2_a

generic completed in 42.47s

orc completed in 3.10883e-39s

Best arch: orc

--

RUN_VOLK_TESTS: volk_16ic_s32f_deinterleave_32f_x2_a

generic completed in 17.28s

orc completed in -3.90577e+11s

Best arch: orc

--

RUN_VOLK_TESTS: volk_32fc_s32f_magnitude_16i_a

generic completed in 4.37s

orc completed in 1.35136e+09s

offset 1107 in1: 29281 in2: 29282

offset 1187 in1: -27601 in2: -27600

offset 1522 in1: -31248 in2: -31249

offset 2396 in1: 26146 in2: 26145

offset 2486 in1: 25394 in2: 25393

offset 4084 in1: 16452 in2: 16451

offset 5052 in1: 28692 in2: 28691

offset 5296 in1: 30869 in2: 30868

offset 5467 in1: -32706 in2: -32705

offset 6388 in1: 19556 in2: 19557

volk_32fc_s32f_magnitude_16i_a: fail on arch orc

Best arch: generic

--

RUN_VOLK_TESTS: volk_32fc_magnitude_32f_a

generic completed in 35.24s

orc completed in 6.93125e+10s

Best arch: generic

--

RUN_VOLK_TESTS: volk_32fc_x2_multiply_32fc_a

generic completed in 52.66s

orc completed in -3.4978e+12s

offset 3 in1: 0.382086 in2: 0.382086

offset 4 in1: 0.496706 in2: 0.496706

offset 8 in1: 0.170967 in2: 0.170967

offset 10 in1: 0.165878 in2: 0.165878

offset 14 in1: 0.398192 in2: 0.398192

offset 15 in1: 0.492358 in2: 0.492358

offset 17 in1: 0.568251 in2: 0.568251

offset 19 in1: 0.0630723 in2: 0.0630723

offset 20 in1: 0.251459 in2: 0.251459

offset 22 in1: 0.348539 in2: 0.348539

volk_32fc_x2_multiply_32fc_a: fail on arch orc

offset 0 in1: 0.140486 in2: 0.140486

offset 1 in1: 0.691375 in2: 0.691375

offset 5 in1: 0.63745 in2: 0.63745

offset 11 in1: 0.644697 in2: 0.644697

offset 14 in1: 0.858205 in2: 0.858205

offset 15 in1: 0.94011 in2: 0.94011

offset 16 in1: 0.490713 in2: 0.490713

offset 18 in1: 0.190573 in2: 0.190573

offset 19 in1: 0.0226408 in2: 0.0226408

offset 20 in1: 0.895774 in2: 0.895774

volk_32fc_x2_multiply_32fc_a: fail on arch orc

offset 1 in1: 0.524585 in2: 0.524585

offset 2 in1: 0.236218 in2: 0.236218

offset 6 in1: 0.733853 in2: 0.733853

offset 9 in1: 0.290247 in2: 0.290247

offset 11 in1: 0.529422 in2: 0.529422

offset 12 in1: 0.180218 in2: 0.180218

offset 14 in1: 0.496568 in2: 0.496568

offset 15 in1: 0.0297472 in2: 0.0297472

offset 19 in1: 0.351138 in2: 0.351138

offset 20 in1: 0.300737 in2: 0.300737

volk_32fc_x2_multiply_32fc_a: fail on arch orc

Best arch: generic

 

Sean

 

From: address@hidden [mailto:address@hidden] On Behalf Of Tom Rondeau
Sent: Thursday, February 23, 2012 11:18 AM


To: Nowlan, Sean
Cc: Nick Foster; address@hidden
Subject: Re: [Discuss-gnuradio] Using volk in Mac: test report

 

On Wed, Feb 22, 2012 at 10:19 PM, Nowlan, Sean <address@hidden> wrote:

I confirmed this works on E100 insofar as I no longer get a segfault on volk_32fc_s32fc_multiply_32fc_a. But volk_32fc_s32f_magnitude_16i_a and volk_32fc_x2_multiply_32fc_a still fail as expected.

 

Sean

 

 

Sean,

I just merged Nick's branch into my safe_align branch. Can you check that one out and test when you get a chance? I just want to make sure we're all on the same branch here. And please post the output of 'ctest -V -R volk'.

 

Thanks!

Tom

 

 

 

From: address@hidden [mailto:address@hidden] On Behalf Of Tom Rondeau
Sent: Tuesday, February 21, 2012 6:49 PM
To: Nick Foster
Cc: Nowlan, Sean; address@hidden


Subject: Re: [Discuss-gnuradio] Using volk in Mac: test report

 

On Tue, Feb 21, 2012 at 6:43 PM, Nick Foster <address@hidden> wrote:

Tom, Sean,

 

There's a couple of things here. First, the Orc volk_32fc_s32f_magnitude_16i_a function is rounding differently than the generic versions on E100 for some reason. Not fatal, totally usable, but it makes the QA code fail. Second, the volk_32fc_x2_multiply_32fc_a looks like it's working fine but the thresholds are too close in the comparison function, which is strange because it uses the same threshold I use everywhere else. I'll keep looking into that. In any case, they're fine for use in Volk as-is.

 

I think the segfault in volk_32fc_s32fc_multiply_32fc_a is being caused by a bug in the profiler code as well. It's not correctly handling complex scalars. The function itself doesn't actually work either, which doesn't help, but it wasn't caught because the profiler code was buggy...

 

Tom, I pushed a fix to my github under "volk_fix". For now I've disabled volk_32fc_s32fc_multiply_32fc_a since I can't figure out a clean way to get it to work under Orc; I had a misunderstanding of how float parameters are handled inside array operations. I also added complex scalar handling. I'll keep looking into solving this one for real but this will get things working for now.

 

--n

 

Nick,

Thanks a ton for working on this. I'll merge your branch asap.

 

Tom

 

 

On Sat, Feb 18, 2012 at 1:05 PM, Tom Rondeau <address@hidden> wrote:

On Fri, Feb 17, 2012 at 6:04 PM, Nowlan, Sean <address@hidden> wrote:

Don’t know how helpful these are, but here you go.

 

Sean

 

Sean,

It looks like a couple of functions are failing from the stdout:

 

volk_32fc_s32f_magnitude_16i_a: fail on arch orc

volk_32fc_x2_multiply_32fc_a: fail on arch orc

 

These are both the Orc implementations of the functions, which seem to work fine on my Intel processors. I don't have access to an OSX box or an E100, so I can't really test this out. The files you sent me don't (appear to) tell me what the real problem is.

 

We'll need some other brave soul out there who can dig into these issues on the platforms for us.

 

Thanks,

Tom

 

 

  

From: address@hidden [mailto:address@hidden] On Behalf Of Tom Rondeau
Sent: Friday, February 17, 2012 5:25 PM
To: Nowlan, Sean
Cc: Nick Foster; address@hidden


Subject: Re: [Discuss-gnuradio] Using volk in Mac: test report

 

On Fri, Feb 17, 2012 at 5:11 PM, Nowlan, Sean <address@hidden> wrote:

I built Tom’s safe_align branch on E100 and ran volk_profile. It segfaulted on “RUN_VOLK_TESTS: volk_32fc_s32fc_multiply_32fc_a”. I’ll get a stack trace for you.

 

Sean

 

Really interesting that it's the same block. Hopefully, it's a single, simple fix. I'll look into it when you can get me the stack trace.

 

Thanks for reporting!

Tom

 

 

 

From: discuss-gnuradio-bounces+sean.nowlan=address@hidden [mailto:discuss-gnuradio-bounces+sean.nowlan=address@hidden] On Behalf Of Tom Rondeau
Sent: Friday, February 17, 2012 2:33 PM
To: Nick Foster
Cc: address@hidden
Subject: Re: [Discuss-gnuradio] Using volk in Mac: test report

 

On Fri, Feb 17, 2012 at 2:30 PM, Nick Foster <address@hidden> wrote:

On Fri, Feb 17, 2012 at 11:20 AM, Carles Fernandez <address@hidden> wrote:

Thanks for the inputs!

We are interested in determining the best architecture at instantiation
time. What would be the best strategy? We thought about running the
same operations several times for each architecture, measuring the
results, and using the fastest one for the processing blocks. Would this
be the right approach?

 

Carles,

 

Run volk_profile. It does exactly what you said, and writes the results to ~/.volk/volk_config. Volk reads this file when it is involked (sorry) to determine which particular function to execute. So all you do is run volk_profile once on any given machine, and it's optimized.
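The profile-once, read-later workflow described above can be sketched in Python as below. Everything here is an assumption made for illustration: the two implementations, the function names, and the one-line-per-kernel "name arch" file layout are hypothetical, and the example writes to a temporary file rather than to ~/.volk/volk_config. It is not the volk_profile implementation.

```python
import os
import tempfile
import time

def time_impl(func, data, runs=5):
    # Run the same operation several times and keep the best time.
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        func(data)
        best = min(best, time.perf_counter() - start)
    return best

def profile(impls, data):
    # Time every implementation and return the name of the fastest.
    timings = {arch: time_impl(func, data) for arch, func in impls.items()}
    return min(timings, key=timings.get)

def write_config(path, choices):
    # Assumed layout: one "kernel arch" pair per line.
    with open(path, "w") as f:
        for kernel, arch in sorted(choices.items()):
            f.write(f"{kernel} {arch}\n")

def read_config(path):
    choices = {}
    with open(path) as f:
        for line in f:
            parts = line.split()
            if len(parts) == 2:
                choices[parts[0]] = parts[1]
    return choices

# Two hypothetical implementations of the same operation.
def sum_squares_loop(xs):
    total = 0.0
    for x in xs:
        total += x * x
    return total

def sum_squares_builtin(xs):
    return sum(x * x for x in xs)

impls = {"loop": sum_squares_loop, "builtin": sum_squares_builtin}
data = [float(i) for i in range(10000)]

choices = {"sum_squares": profile(impls, data)}
path = os.path.join(tempfile.mkdtemp(), "example_config")
write_config(path, choices)
print(read_config(path))
```

The point of the split is that the expensive measurement happens once per machine, while later runs only pay for reading a small file.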

 

--n

 

Carles,

This is discussed on the webpage:

 

We'll be updating this as things progress with Volk, but the profiler info is there already.

 

Tom

 

 

 

 

 

 




