Re: deadlock in scm_join_thread(_timed)

From: Neil Jerram
Subject: Re: deadlock in scm_join_thread(_timed)
Date: Sun, 25 May 2008 14:16:07 +0100

2008/5/25 Julian Graham <address@hidden>:
Hi everyone,

While I was testing and debugging some of the SRFI-18 code that Neil
and I were working on, I noticed a deadlock that happens in
scm_join_thread_timed.  I'm pretty sure it affects the 1.8 codebase as
well, although it's probably more common when doing timed joins.

Thread joining in Guile (1.9 or 1.8) works as follows:

1. If the target thread has exited, return.
2. Block on the target thread's join queue.
3. When woken (because of a pthread_cond_signal, a spurious pthreads
wakeup, or, in 1.9, a timeout expiration), check the target thread's
exit status -- if it has exited, return.
4. Otherwise, SCM_TICK.
5. Go to step 2.

The deadlock can happen if the thread exits during the tick, because
there's no check of the exit status before block_self is called again.
I'm pretty sure that moving step 1 into the beginning of the loop
would fix this. I can submit a patch against 1.8, 1.9, or both.
Let me know what you guys would like.

Hi Julian,

Based on the synopsis above, I agree that moving step 1 inside the loop should fix this.  In addition, though, I think it would be very good if we could add a minimal test that currently reproduces the deadlock, so that it guards against future regressions here.  Do you have such a test?

No need for a patch against both 1.8 and 1.9; just one will do, and git cherry-pick will handle the other for us (unless the fix is significantly different in the two branches).
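
For illustration only, the reordered loop can be sketched outside Guile with plain pthreads. This is not Guile's code: toy_thread, toy_thread_init, toy_thread_mark_exited, toy_block and toy_join are hypothetical names, a 10 ms pthread_cond_timedwait stands in for blocking on the join queue, and the tick_hook callback stands in for whatever may run during SCM_TICK.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical stand-in for the target thread's state (not scm_i_thread). */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  join_queue;
    bool            exited;
} toy_thread;

static void toy_thread_init(toy_thread *t)
{
    pthread_mutex_init(&t->lock, NULL);
    pthread_cond_init(&t->join_queue, NULL);
    t->exited = false;
}

/* What the target does when it finishes: set the flag, wake any joiners. */
static void toy_thread_mark_exited(toy_thread *t)
{
    pthread_mutex_lock(&t->lock);
    t->exited = true;
    pthread_cond_broadcast(&t->join_queue);
    pthread_mutex_unlock(&t->lock);
}

/* Assumed model of "block on the join queue": wait up to 10 ms.
   The caller must hold t->lock. */
static void toy_block(toy_thread *t)
{
    struct timespec deadline;
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_nsec += 10 * 1000 * 1000;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec += 1;
        deadline.tv_nsec -= 1000000000L;
    }
    pthread_cond_timedwait(&t->join_queue, &t->lock, &deadline);
}

/* Join loop with the exit check moved to the top of the loop.
   Returns the number of ticks performed before the join completed. */
static int toy_join(toy_thread *t, void (*tick_hook)(toy_thread *))
{
    int ticks = 0;

    assert(t != NULL);
    pthread_mutex_lock(&t->lock);
    for (;;) {
        if (t->exited)          /* step 1, now re-checked on every pass */
            break;
        toy_block(t);           /* step 2 */
        if (t->exited)          /* step 3 */
            break;
        /* Step 4: the tick runs without the lock, so the target
           may exit here and its wakeup is missed by this thread. */
        pthread_mutex_unlock(&t->lock);
        if (tick_hook)
            tick_hook(t);
        pthread_mutex_lock(&t->lock);
        ticks++;
    }
    pthread_mutex_unlock(&t->lock);
    return ticks;
}
```

Calling toy_join(&t, toy_thread_mark_exited) on a freshly initialised toy_thread makes the target "exit" during the first tick; the check at the top of the loop then ends the join without blocking again. In this sketch the timeout would eventually rescue a loop that lacked that check, but an untimed wait would not, which is the deadlock described above.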
