Re: Locks and threads
Wed, 11 Feb 2009 18:18:39 -0600
2009/2/11 Neil Jerram <address@hidden>:
> Linas Vepstas <address@hidden> writes:
>> Err, sort of, yes, unless I misunderstand. Guile 1.8 makes
>> a certain basic assumption that is splattered throughout
>> the code; it rather intentionally re-orders the order in which
>> one of the locks is taken. If I remember correctly, it's the
>> "in guile mode" lock. So if you just go looking for locks
>> that are released out-of-order, you'll find lots of these.
> Yes, I think I understand this now (having seen it myself). The
> pattern is
> - thread holding its heap_mutex - which is the normal state in guile
Yes, that's the one.
> That in itself doesn't actually cause an ordering problem, but then
> the thread releases the other mutex without releasing the heap mutex
> first - which is perceived (by helgrind at least) as a problem.
> (Is something like this actually _ever_ a problem? If locks are
> always _acquired_ in the right order, how can the order of _releasing_
> ever cause a problem?)
Yes, it can be a problem; I don't want to dream up
some particular scenario (this stuff destroys brain
cells) ... but I do vaguely remember skimming some
wikipedia article on locking that discussed this.
I think the scenario involves three locks, though.
This is why helgrind checks lock order -- it's one of
the locking problems it can actually detect. However,
for the case of guile, the heap mutex is not visible to
anything that isn't guile, and thus, it's safe in this
particular case. If there were outside users, things
would be different.
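
For concreteness, here is a minimal sketch of the release pattern
in question, using plain pthreads rather than anything from Guile.
The names other_mutex, shared_counter and run_release_order_demo are
made up for illustration, and heap_mutex here is only a stand-in for
the real thing. Every thread takes the two locks in the same order
but drops the first one taken before the second. In this two-lock
form it runs to completion; the comment marks where an actual
ordering inversion would start.

```c
#include <assert.h>
#include <pthread.h>

#define NUM_THREADS 4
#define ITERATIONS  10000

/* Hypothetical stand-ins; these are not Guile's real locks. */
static pthread_mutex_t other_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t heap_mutex  = PTHREAD_MUTEX_INITIALIZER;
static long shared_counter;

static void *release_order_worker(void *unused)
{
    (void)unused;
    for (int i = 0; i < ITERATIONS; i++) {
        /* Acquisition order is identical in every thread. */
        pthread_mutex_lock(&other_mutex);
        pthread_mutex_lock(&heap_mutex);
        shared_counter++;                 /* both locks are held here */

        /* Non-nested release: the first lock taken is dropped first. */
        pthread_mutex_unlock(&other_mutex);
        /* Re-locking other_mutex at this point, while heap_mutex is
           still held, would invert the acquisition order and could
           deadlock against a thread that is between the two locks. */
        pthread_mutex_unlock(&heap_mutex);
    }
    return NULL;
}

/* Runs the workers and returns the final counter value. */
long run_release_order_demo(void)
{
    pthread_t threads[NUM_THREADS];

    shared_counter = 0;
    for (int i = 0; i < NUM_THREADS; i++) {
        int rc = pthread_create(&threads[i], NULL,
                                release_order_worker, NULL);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < NUM_THREADS; i++)
        pthread_join(threads[i], NULL);
    return shared_counter;
}
```

Call run_release_order_demo() from a main() and build with -pthread;
running the result under "valgrind --tool=helgrind" is one way to
see what the tool makes of the pattern.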
> The other thing to bear in mind is that 99% of this will just
> evaporate if we move to BDW-GC for 2.0.x; so - assuming we do end up
> doing that - it makes sense to take a slightly more pragmatic approach
> than normal for 1.8.x.
Sure. As a reminder ... the only real problem that
remained was a race to update some hash table
when define was being used from several threads.
*that's* the bug that needs attention (but is hard to fix).
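
To illustrate the class of bug (not Guile's actual table, and not a
proposed fix), here is a rough sketch of several threads inserting
into one shared chained hash table. Everything in it is hypothetical:
the entry struct, table_define, table_mutex and the integer keys are
invented for the example. Without the mutex around the two-line
bucket update, concurrent inserts into the same bucket can overwrite
each other's head pointer and lose entries. Whether a single lock
like this is workable inside Guile 1.8 is a separate question.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define NUM_BUCKETS        64
#define NUM_THREADS        4
#define DEFINES_PER_THREAD 1000

struct entry { int key; int value; struct entry *next; };

static struct entry *buckets[NUM_BUCKETS];
static pthread_mutex_t table_mutex = PTHREAD_MUTEX_INITIALIZER;
static int thread_ids[NUM_THREADS];

/* Insert a binding; keys are assumed distinct, so no lookup is done. */
static void table_define(int key, int value)
{
    struct entry *e = malloc(sizeof *e);
    if (e == NULL)
        abort();
    e->key = key;
    e->value = value;

    unsigned b = (unsigned)key % NUM_BUCKETS;
    pthread_mutex_lock(&table_mutex);
    e->next = buckets[b];      /* read the bucket head ...            */
    buckets[b] = e;            /* ... and replace it, as one step     */
    pthread_mutex_unlock(&table_mutex);
}

static void *define_worker(void *arg)
{
    int thread_index = *(int *)arg;
    /* Each thread uses its own key range, so keys never collide. */
    for (int i = 0; i < DEFINES_PER_THREAD; i++)
        table_define(thread_index * DEFINES_PER_THREAD + i, i);
    return NULL;
}

/* Free every entry and count how many were present. */
static long table_clear(void)
{
    long count = 0;
    for (int b = 0; b < NUM_BUCKETS; b++) {
        while (buckets[b] != NULL) {
            struct entry *e = buckets[b];
            buckets[b] = e->next;
            free(e);
            count++;
        }
    }
    return count;
}

/* Runs the workers and returns the number of entries that survived. */
long run_define_demo(void)
{
    pthread_t threads[NUM_THREADS];

    table_clear();
    for (int i = 0; i < NUM_THREADS; i++) {
        thread_ids[i] = i;
        int rc = pthread_create(&threads[i], NULL,
                                define_worker, &thread_ids[i]);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < NUM_THREADS; i++)
        pthread_join(threads[i], NULL);
    return table_clear();
}
```

With the lock in place the returned count should equal
NUM_THREADS * DEFINES_PER_THREAD; removing the lock/unlock pair is
the quickest way to give helgrind a data race to report.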
>  I am only running a basic startup test, though: "valgrind
> --tool=helgrind guile -q <<EOF". Were you running something a lot
> more complex than that?
I had written some simple test case, which I think sprouted
a bunch of threads, and then did simple scheme things
in each .. e.g. just adding numbers, or whatever. I'm
attaching some kind of simple test case to this email;
however, it is very hacked, so I don't know if it actually
will find bugs, and it probably doesn't do what it
claims to do. I provide it only as a short-cut for creating
a new test case. ...