Re: [Discuss-gnuradio] Segfault with volk on 32 bit AMD
From:
Frederick Stevens
Subject:
Re: [Discuss-gnuradio] Segfault with volk on 32 bit AMD
Date:
Tue, 20 Mar 2012 13:29:32 -0500
User-agent:
Mozilla/5.0 (X11; Linux i686; rv:11.0) Gecko/20120314 Thunderbird/11.0
Tom & Nick,
Let me try this on the other AMD 32 bit machine. This is a Tyan
S2460 dual processor AMD motherboard which has its own quirks and
has a tendency to not like some things. I have it running fairly
well without linux kernel oops and such. I will also check my
syslogs and dmesg for more info. If things run fine without the
patch on the other AMD machine then I will consider it a specific
motherboard issue and not worry about it.
Cheers,
Fred
On 03/20/2012 01:24 PM, Nick Foster wrote:
On Tue, Mar 20, 2012 at 10:59 AM, Tom
Rondeau <address@hidden>
wrote:
On Tue, Mar 20, 2012 at 11:03 AM, Frederick
Stevens <address@hidden>
wrote:
Tom, et al.
Here is the output from the volk_profile run (see
attached).
Cheers,
Fred
Well, Fred, that output looks good. Everything's
showing up as it should. Interesting that it passed this
time, but I half expected it to. It seems like there's a
memory allocation problem going on: when it crashed, it
did so just a bit after getting halfway through, which must
have been when it hit something else that was allocated.
That's very odd behavior that I've seen on occasion, but
given the way the memory is allocated in the
volk_qa_aligned_mem_pool, I wouldn't expect there to be a
problem.
Right now, I'm at a loss on how to proceed. I
can't think of any more really useful tests. Were I able
to reproduce this error on one of my machines, I'd just
have to start tinkering around and getting more output
data.
If volk_qa_aligned_mem_pool were passed the wrong "type"
argument due to a parser error, it could allocate half the
required memory. But I'd expect to see significantly more
catastrophic failures across many more machines & tests if
this were the case.
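To make the arithmetic behind that concern concrete, here is a minimal C sketch. It is not VOLK's QA code: `bytes_needed` and the `elem_kind` enum are hypothetical names used only for illustration, and a 4-byte float is assumed. It shows that sizing a complex-float buffer as if it held plain floats yields exactly half the bytes the kernel will walk over.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical element kinds, for illustration only (not VOLK's type names). */
typedef enum { ELEM_FLOAT, ELEM_COMPLEX_FLOAT } elem_kind;

/* Bytes needed for num_points elements of the given kind.
 * A complex float is stored as two floats (real, imaginary). */
size_t bytes_needed(size_t num_points, elem_kind kind)
{
    size_t per_elem = sizeof(float);

    assert(kind == ELEM_FLOAT || kind == ELEM_COMPLEX_FLOAT);
    if (kind == ELEM_COMPLEX_FLOAT)
        per_elem *= 2;
    return num_points * per_elem;
}
```

With the num_points=204600 reported in the gdb output quoted further down the thread, the complex case needs 1,636,800 bytes, while a float-sized allocation would provide only 818,400, so a loop over the complex buffer would run out of allocated memory about halfway through.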
--n
Tom
On 03/20/2012 09:47 AM, Marcus D. Leech
wrote:
On 03/20/2012 10:42 AM,
Tom Rondeau wrote:
Fred,
Thanks. Can you get the entire output
(in a text file)? There's some
information that's printed at the top
that's important. Just run it from the
command-line and pipe (>) the output
into a file.
<pedantic>
Just because I'm a grumpy old Unix guy from
waaaaay back, I'll point out that the term
"pipe" is very frequently mis-used to mean
"redirect", when in fact, the pipe symbol in
the Unix shell is "|" and is a mechanism for
attaching the standard output of one program
to the standard input of another. The
">" symbol means "redirect the standard
output to a file", which is similar, but not
the same as,
the use of a "pipe", which is an IPC
mechanism.
</pedantic>
Oh, and that trailing whitespace
warning shouldn't be a problem. The
patch should still have been applied.
Thanks,
Tom
On 03/20/2012 08:49 AM, Tom
Rondeau wrote:
On Mon,
Mar 19, 2012 at 4:49 PM,
Frederick Stevens <address@hidden>
wrote:
Tom,
New run using my simple
"trace". See attached
files.
Cheers,
Fred
Fred,
A good start. It's only going through half of the data
it's supposed to before seg faulting, so it's like one of
the buffers (probably the bPtr buffer to the 32f input)
isn't getting allocated properly.
I've attached a patch that only tests this kernel so no
other outputs will confuse things and I've shortened the
run length (single iteration, fewer samples). This now
spits out the data used to generate the input and output
buffers. It also outputs the size of the data types in
the test instead of the pointer size.
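The distinction between the size of a data type and the size of a pointer to it can be sketched in C as follows. This is an illustration only, not the contents of the patch; `pointer_size`, `complex_elem_size`, and `print_sizes` are hypothetical helpers. The pointer size only reflects the address width of the machine, whereas the element size is what determines how large a buffer has to be.

```c
#include <assert.h>
#include <complex.h>
#include <stddef.h>
#include <stdio.h>

/* C99 stores a complex float as two adjacent floats (real, imaginary). */
static_assert(sizeof(float complex) == 2 * sizeof(float),
              "complex float should be two floats wide");

/* Size of a pointer to a complex float: depends on the address width. */
size_t pointer_size(void) { return sizeof(float complex *); }

/* Size of one complex float element: this is what buffer sizes scale with. */
size_t complex_elem_size(void) { return sizeof(float complex); }

/* Print both sizes so they can be compared in a test log. */
void print_sizes(FILE *out)
{
    fprintf(out, "pointer size: %zu bytes, element size: %zu bytes\n",
            pointer_size(), complex_elem_size());
}
```

On a typical 32-bit machine the pointer is 4 bytes, the same as a plain float, so printing the pointer size alone cannot reveal whether a complex buffer was sized correctly.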
If you're unfamiliar with working with patches, just
reset your git tree (git reset --hard, unless you have
some changes you need to / want to keep) and apply this
(git apply location/volk_slackware32.diff). I suggest the
reset so there aren't any conflicts or problems when
applying.
Thanks,
Tom
On 03/19/2012 11:26 AM, Tom Rondeau wrote:
On Mon, Mar 19, 2012 at 12:04 PM, Frederick Stevens
<address@hidden> wrote:
Tom,
See the attached file. I am running volk_profile now. If
this is what you need then that is great otherwise I will
keep working on this with whatever suggestions you have.
Cheers,
Fred
That'll be a good start. We'll see if that tells us
anything.
Thanks,
Tom
On 03/19/2012 08:10 AM, Tom Rondeau wrote:
On Sun, Mar 18, 2012 at 8:00 PM, Frederick Stevens
<address@hidden> wrote:
Volk_profile ran to completion. I am using the git source
tree updated just before I did the run. I commented out
line 38 of volk_profile.cc as you suggested and ran
volk_profile under gdb. The output is in the attached text
file. I have also attached the generated volk_config from
~/.volk/volk_config.
Strange that
it's just that
kernel, then.
Can you put in
some debug
lines that
will print out
the size of
the buffers
being used and
the 'number'
variable in
volk_32fc_x2_multiply_32fc_a
when the crash
occurs. I just
want to see if
the loop is
trying to go
beyond the
bounds of the
arrays.
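The kind of debug check being asked for here could look something like the C sketch below. It is an assumption-laden illustration, not code from VOLK: `loop_fits` is a hypothetical helper, and since the kernel itself only receives pointers, the buffer size in bytes would have to be passed in from the test code that allocated it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical debug helper: returns 1 if a loop over num_points elements
 * of elem_size bytes stays inside a buffer of buf_bytes bytes, 0 otherwise.
 * The values are also logged to stderr so they survive a later crash. */
int loop_fits(const char *name, size_t buf_bytes, size_t elem_size,
              size_t num_points)
{
    int ok;

    assert(elem_size > 0);
    ok = num_points <= buf_bytes / elem_size;
    fprintf(stderr, "%s: buf_bytes=%zu elem_size=%zu num_points=%zu -> %s\n",
            name, buf_bytes, elem_size, num_points,
            ok ? "fits" : "OUT OF BOUNDS");
    return ok;
}
```

Calling this once per buffer just before the kernel runs would show directly whether the loop count exceeds what was allocated, without waiting for the segfault.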
I noted from running gnuradio-companion version 3.5.1
(which works) that when I use a multiply block, this
message from python is generated:
./top_block.py
>>> gr_fir_fff: using 3DNow!
but volk_profile does not seem to recognize the 3DNow!
processor extensions (produces sse2 and sse3 messages on
the Intel Atom 32 bit machine).
Yeah, that's fine. Without a 3DNow! kernel, Volk will just
fall back on the generic implementation. The thought being
that the generic version will work for everyone. So we
need to figure out why that's not true for your...
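The fallback idea described here can be sketched in C with a function pointer. This is not VOLK's actual dispatcher; `mult_fn`, `multiply_generic`, and `select_multiply` are hypothetical names, and a plain float multiply is used to keep the example short. The point is only that a portable implementation is always available when no architecture-specific one exists.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical kernel signature, for illustration only. */
typedef void (*mult_fn)(float *out, const float *a, const float *b, size_t n);

/* Portable implementation that should work on any machine. */
void multiply_generic(float *out, const float *a, const float *b, size_t n)
{
    assert(n == 0 || (out != NULL && a != NULL && b != NULL));
    for (size_t i = 0; i < n; i++)
        out[i] = a[i] * b[i];
}

/* Use the architecture-specific kernel when one is available (non-NULL),
 * otherwise fall back on the generic one. */
mult_fn select_multiply(mult_fn arch_specific)
{
    return arch_specific != NULL ? arch_specific : multiply_generic;
}
```

On a machine with no matching optimized kernel, passing NULL selects the generic loop, which is why a crash in the generic version is the surprising part.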
Hope this helps! Let me know if you want me to try
anything else. I'll let you know how things turn out on
the other machine as well.
Cheers,
Fred
Thanks.
Tom
On 03/18/2012 04:31 PM, Tom Rondeau wrote:
On Fri, Mar 16, 2012 at 6:11 PM, Frederick Stevens
<address@hidden> wrote:
Well, after a few restarts, here is my output. I did a
fresh pull from git because I was getting some errors with
missing *.h files in gruel/src/swig or something like
that. Hope this helps!
RUN_VOLK_TESTS: volk_32fc_32f_multiply_32fc_a
Program received signal SIGSEGV, Segmentation fault.
0xb7edbb74 in volk_32fc_32f_multiply_32fc_a_generic (cVector=0xb7448008, aVector=0xb7768008, bVector=0xb78f8008, num_points=204600)
    at /home/fred/extras/gnuradio/gnuradio/volk/include/volk/volk_32fc_32f_multiply_32fc_a.h:74
74          *cPtr++ = (*aPtr++) * (*bPtr++);
(gdb) bt
#0  0xb7edbb74 in volk_32fc_32f_multiply_32fc_a_generic (cVector=0xb7448008, aVector=0xb7768008, bVector=0xb78f8008, num_points=204600)
    at /home/fred/extras/gnuradio/gnuradio/volk/include/volk/volk_32fc_32f_multiply_32fc_a.h:74
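For readers without the source at hand, the loop that faulted can be approximated by the C sketch below. Only the parameter names and the statement on line 74 come from the gdb output; the function name `multiply_32fc_32f_sketch`, the C99 complex types, and the loop structure are assumptions, not the actual VOLK header.

```c
#include <assert.h>
#include <complex.h>
#include <stddef.h>

/* Sketch of a complex-by-real multiply: c[i] = a[i] * b[i], where a and c
 * are complex floats and b is a plain float. Every buffer must hold at
 * least num_points elements, or the pointer walk runs off the end. */
void multiply_32fc_32f_sketch(float complex *cVector,
                              const float complex *aVector,
                              const float *bVector,
                              unsigned int num_points)
{
    float complex *cPtr = cVector;
    const float complex *aPtr = aVector;
    const float *bPtr = bVector;

    assert(num_points == 0 || (cPtr != NULL && aPtr != NULL && bPtr != NULL));
    for (unsigned int number = 0; number < num_points; number++) {
        *cPtr++ = (*aPtr++) * (*bPtr++);   /* the statement gdb reports */
    }
}
```

Because the loop has no knowledge of the buffer sizes, a fault on this statement suggests that at least one of the three buffers is shorter than num_points elements, or that one of the pointers is invalid.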
Alright, Fred, definitely something strange going on here.
My only guess is that for some reason on your
architecture/OS/whatever, something is being handled
incorrectly and the buffers a, b, and c are not getting
generated correctly, maybe something like it's not
doubling the number of items for the complex data type
(before this function test, there are 16ic, or complex
shorts, being tested, but this is the first complex float
test).
It's hard to tell if it's something about it being an AMD
chip, 32-bit, Slackware version, gcc version, etc. And I
don't have an AMD chip to test on, but I could load up a
32-bit Slackware VM at least.
How much work are you willing to put into this to help us
nail this down?
If you can follow through the volk_profile test code, we
can start outputting more debug info. To start with, I'd
suggest going into volk/apps/volk_profile.cc and
commenting out line 38, then rebuilding the application
and running this new volk_profile to see if it fails on
any other kernels.