Re: Locks and threads
Wed, 11 Feb 2009 23:53:23 +0000
Linas Vepstas <address@hidden> writes:
> Err, sort of, yes, unless I misunderstand. Guile 1.8 makes
> a certain basic assumption that is splattered throughout
> the code; it rather intentionally re-orders the order in which
> one of the locks is taken. If I remember correctly, it's the
> "in guile mode" lock. So if you just go looking for locks
> that are released out-of-order, you'll find lots of these.
Yes, I think I understand this now (having seen it myself). The
sequence is:
- thread holding its heap_mutex - which is the normal state in guile
- thread calls scm_i_scm_pthread_mutex_lock to lock some other mutex
- unlocks the heap_mutex
- locks the other mutex
- locks the heap mutex again.
That in itself doesn't actually cause an ordering problem, but then
the thread releases the other mutex without releasing the heap mutex
first - which is perceived (by helgrind at least) as a problem.
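To make the sequence above concrete, here is a minimal C sketch. It is not the actual Guile source: heap_mutex, other_mutex, lock_from_guile_mode and demo_release_order are hypothetical names, and the helper only assumes that scm_i_scm_pthread_mutex_lock behaves as in the steps listed above.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-ins, not Guile's real variables: heap_mutex plays
   the per-thread "in guile mode" lock, other_mutex is any other mutex. */
static pthread_mutex_t heap_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t other_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Assumed shape of what scm_i_scm_pthread_mutex_lock does, following
   the steps above.  The caller must already hold heap_mutex. */
static int lock_from_guile_mode(pthread_mutex_t *m)
{
    int rc, rc2;

    rc2 = pthread_mutex_unlock(&heap_mutex); /* drop the heap mutex */
    assert(rc2 == 0);
    rc = pthread_mutex_lock(m);              /* may block; heap mutex not held */
    rc2 = pthread_mutex_lock(&heap_mutex);   /* take the heap mutex again */
    assert(rc2 == 0);
    return rc;
}

/* Runs the whole sequence once.  Returns 0 on success. */
int demo_release_order(void)
{
    int rc, rc2;

    rc = pthread_mutex_lock(&heap_mutex);    /* normal state in guile mode */
    if (rc != 0)
        return rc;

    /* Acquisition order ends up as: other_mutex, then heap_mutex. */
    rc = lock_from_guile_mode(&other_mutex);

    /* other_mutex is released first, while heap_mutex is still held.
       This is the release order described above. */
    if (rc == 0)
        rc = pthread_mutex_unlock(&other_mutex);

    rc2 = pthread_mutex_unlock(&heap_mutex);
    return rc != 0 ? rc : rc2;
}
```

Calling demo_release_order() from a small test program built with -pthread, and running that under helgrind, would be one way to check whether this release order alone is what gets reported. The sketch makes no claim about what helgrind actually prints for it.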
(Is something like this actually _ever_ a problem? If locks are
always _acquired_ in the right order, how can the order of _releasing_
ever cause a problem?)
The async_mutex handling (that I've posted a patch for) is one example
of this.
> At the time, I had decided that
> 1) it would be a lot of work to get these in order, and the
> patch would likely be rejected, and
> 2) the reordering is essentially harmless (since it's
> consistently done).
> 3) there might have even been a performance hit (I don't
> remember) by trying to get these into order.
The other thing to bear in mind is that 99% of this will just
evaporate if we move to BDW-GC for 2.0.x; so - assuming we do end up
doing that - it makes sense to take a slightly more pragmatic approach
than normal for 1.8.x.
> This made using valgrind impossible, and that's why I created
> the custom patch -- it made a point of ignoring this one
> reversed-order, while checking for badness in everything else.
Thanks. I understand this much better now! On the other hand, after
the async_mutex patch, my helgrind output is only reporting a
couple of problems now, so it looks like helgrind-cleanliness might be
achievable. I am only running a basic startup test, though: "valgrind
--tool=helgrind guile -q <<EOF". Were you running something a lot
more complex than that?