Re: Threads, DO and Runloop unexpected behaviour
From: Richard Frith-Macdonald
Subject: Re: Threads, DO and Runloop unexpected behaviour
Date: Tue, 1 Nov 2005 20:41:32 +0000
On 1 Nov 2005, at 17:38, Richard Frith-Macdonald wrote:
> On 1 Nov 2005, at 16:59, Wim Oudshoorn wrote:
>> No, the -doExternal method will run the runloop again, so it is
>> available for handling the new -doInternal requests. Mostly this
>> works fine and while -doExternal is executing it will process
>> -doInternal. However once in a while it will fail.
>> This is unfortunate because in our, more complex case, this will
>> lead to a deadlock.
> Ah I see .. the problem is not 'how is it possible to get two
> doExternal messages logged together',
> it's that sometimes you get no doInternal requests inside a
> doExternal, and you think this means that you will never get
> doInternal's (rather than just being an artifact of the
> scheduler). Seems plausible ... in which case we have a bug. I'll
> modify the code to produce something that will actually deadlock to
> prove that doInternal *never* gets called.
I modified -doInternal to increment a global variable, and
-doExternal to repeatedly run the runloop as long as the variable was
unchanged from its value on entry ... thus forcing the program to
hang the first time -doExternal was entered without any -doInternal
request being accepted while it ran.
It hung very quickly ... a convincing demonstration of the bug.
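Roughly, the modification looks like the sketch below. This is only an illustration of the idea: the class name Server, the variable name internalCount, and the method signatures are assumptions made for the example, not the actual test code.

```objc
#import <Foundation/Foundation.h>

// Hypothetical stand-in for the test server class; the real class name
// and method signatures in the test program may differ.
@interface Server : NSObject
- (void) doInternal;
- (void) doExternal;
@end

// Incremented each time a -doInternal request is actually handled.
static volatile unsigned internalCount = 0;

@implementation Server

- (void) doInternal
{
  internalCount++;
}

- (void) doExternal
{
  unsigned onEntry = internalCount;

  // Keep running the loop until at least one -doInternal has been handled.
  // If no -doInternal is ever delivered while we are in here, this loop
  // never exits and the program hangs, which is exactly what we want
  // to detect.
  while (internalCount == onEntry)
    {
      [[NSRunLoop currentRunLoop]
        runMode: NSDefaultRunLoopMode
        beforeDate: [NSDate dateWithTimeIntervalSinceNow: 0.005]];
    }
}

@end
```

With this in place, a -doExternal that merely happened not to see a -doInternal because of scheduling would eventually return, while one that can never receive -doInternal hangs for good.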
Turning on lots of debug in NSConnection, and running with the
GSLogThread user default set to YES (to include the thread ID in
NSLog() output), it was quite straightforward (though tedious) to
plough through the copious logs, compare them with the source code,
and see what was happening.
So, I can see what the bug is, and there is a workaround for your code.
The problem happens when the system is sending back a reply to a DO
message ... it writes that reply running the runloop in
NSConnectionReplyMode. Now, in order to allow callbacks over DO, the
connection allows incoming requests while it is in
NSConnectionReplyMode ... so it receives an incoming request for a
-doExternal before it manages to write the reply to the preceding
-doInternal. This means that the thread (A) hasn't received a reply
to one request, so it can't start sending another.
When -doExternal runs the runloop, it does so in
NSDefaultRunLoopMode ... but the write of the reply was not done in
that mode, so even though it needs to be sent, it won't be processed
by the runloop ... so the reply *never* goes back to thread A as long
as we are processing the -doExternal method. As soon as -doExternal
completes (after 20 runs of the loop, 0.005 seconds each, in your
original code) control returns to the runloop in
NSConnectionReplyMode, the reply to the -doInternal is written, and
thread A can then send another request.
The workaround for your code is, of course, to have -doExternal run
the loop in NSConnectionReplyMode rather than in NSDefaultRunLoopMode.
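As a minimal sketch of that workaround (the class name Server and the exact shape of the loop are assumptions; only the 20 runs of 0.005 seconds and the change of mode come from the discussion above):

```objc
#import <Foundation/Foundation.h>

// Hypothetical stand-in for the class that implements -doExternal.
@interface Server : NSObject
- (void) doExternal;
@end

@implementation Server

- (void) doExternal
{
  int i;

  // 20 short runs of the loop, as in the original code, but in
  // NSConnectionReplyMode instead of NSDefaultRunLoopMode, so that the
  // pending write of the -doInternal reply can be processed while we
  // are still inside -doExternal.
  for (i = 0; i < 20; i++)
    {
      [[NSRunLoop currentRunLoop]
        runMode: NSConnectionReplyMode
        beforeDate: [NSDate dateWithTimeIntervalSinceNow: 0.005]];
    }
}

@end
```

This relies on the connection accepting incoming requests in NSConnectionReplyMode, as described above, so new -doInternal requests can still arrive during those runs.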
However, I've cleaned up the port code a little in CVS (a lot still
to do) and incorporated the best fix I could think of ... to put
ports in both NSConnectionReplyMode AND NSDefaultRunLoopMode when
sending a reply. Ideally I guess the reply write should be sent out
whenever the runloop is run, irrespective of the mode it's in ... but
there is no mechanism to do that, so the only way I can think of
would be to hack in a private API to NSRunLoop to do it ... I'd
rather not do that.