Re: The 2.0.9 VM cores in enqueue (threads.c:309) -- partial fix, patch

From: Andrew Gaylard
Subject: Re: The 2.0.9 VM cores in enqueue (threads.c:309) -- partial fix, patch attached
Date: Mon, 17 Jun 2013 12:06:10 +0200
User-agent: Mozilla/5.0 (X11; SunOS sun4u; rv:17.0) Gecko/20130510 Thunderbird/17.0.6

On 04/29/13 12:10, Mark H Weaver wrote:
> Hi Andrew,
>
> On 28 April 2013 03:57, Andrew Gaylard <address@hidden> wrote:
>> Those 0x304 values look dodgy to me, and explain why the
>> SCM_SETCDR causes an invalid memory access.
>>
>> (gdb) p *SCM2PTR(q)
>> $26 = {word_0 = 0x304, word_1 = 0x1039c4c20}
>
> What's happening here is that the wait queue (m->waiting in fat_mutex)
> is somehow getting corrupted.  The code above ('enqueue' in threads.c)
> is trying to add a new element to the queue.  The queue is represented
> as a pair whose CDR is the list of items in the queue, and whose CAR
> points to the last pair of that list.  Somehow, the CAR is becoming null
> even though the CDR is non-empty.  This should never happen.
>
> I looked through the relevant code, and it's not obvious to me how this
> could happen.  The only functions I see that manipulate this queue are
> 'enqueue', 'remqueue', and 'dequeue', all static functions in threads.c.
> As far as I can see, these functions maintain the invariant that the CAR
> is null if and only if the CDR is null.  All queue manipulation is done
> in async.h) which lock a single global pthread mutex.
>
> Any ideas?


I've had some more time to look into this problem, and now have a partial fix.

The problem does not occur on Linux x86 or x86_64 (Ubuntu-12.04).
The problem always occurs on Solaris-10u9, both x86_64 and SPARC.

The problem is always a segmentation fault when trying to write
to 0x30c, at threads.c:309.  Inspection of the remqueue function shows
that the logic is not correct when removing the last entry in the queue.
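
To illustrate the failure mode, here is a minimal sketch in plain C of a
queue with the same shape: a header pair whose CDR is the item list and
whose CAR points at the last pair.  It is not Guile's code and it is not
the attached patch.  The `cell` type and the helpers `new_cell`,
`make_queue` and `queue_ok` are hypothetical stand-ins, and NULL plays the
role of the empty list.  The point is the tail update in remqueue: when the
removed pair is the last one, the CAR has to move back to the previous
pair, or become empty if nothing is left.  If the CAR were instead left
holding the empty-list value while items remain, the next enqueue would
write through it, which would be consistent with the faulting address
(0x30c is 0x304 plus one 8-byte word).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a Scheme pair; NOT Guile's SCM representation. */
typedef struct cell {
    struct cell *car;  /* header: last pair of the list (unused in item cells) */
    struct cell *cdr;  /* header: first pair of the list; item cell: next pair */
    int item;          /* payload, only meaningful in item cells */
} cell;

static cell *new_cell(int item)
{
    cell *c = calloc(1, sizeof *c);
    if (c == NULL)
        abort();
    c->item = item;
    return c;
}

/* An empty queue is a header with car == NULL and cdr == NULL. */
static cell *make_queue(void)
{
    return new_cell(0);
}

/* Append an item and return its pair so it can be removed later. */
static cell *enqueue(cell *q, int item)
{
    cell *c = new_cell(item);
    if (q->cdr == NULL)
        q->cdr = c;        /* first item: header points straight at it */
    else
        q->car->cdr = c;   /* this write faults if the tail pointer is bogus */
    q->car = c;
    return c;
}

/* Unlink pair c from the queue; returns 1 if found, 0 otherwise. */
static int remqueue(cell *q, cell *c)
{
    cell *prev = q;
    for (cell *p = q->cdr; p != NULL; prev = p, p = p->cdr) {
        if (p == c) {
            /* Removing the last pair: the tail must fall back to the
               previous pair, or to empty if the list is now empty. */
            if (c == q->car)
                q->car = (prev == q) ? NULL : prev;
            prev->cdr = c->cdr;
            c->cdr = NULL;
            return 1;
        }
    }
    return 0;
}

/* Invariant: car is NULL iff cdr is NULL, else car is the last pair. */
static int queue_ok(const cell *q)
{
    if (q->cdr == NULL)
        return q->car == NULL;
    const cell *p = q->cdr;
    while (p->cdr != NULL)
        p = p->cdr;
    return q->car == p;
}
```

Calling queue_ok after every enqueue and remqueue (under the same lock)
is a cheap way to catch the corruption at the point where it is
introduced, not at the later enqueue that trips over it.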

The attached patch helps -- my code now runs for much longer and no longer
crashes.  However, it now hangs somewhere else (which may be an unrelated
problem).

I'd be grateful for any feedback.

Attachment: fix-guile-thread-remqueue.patch
Description: Text Data
