bug-gnustep

Re: Locking bug in NSMessagePortWin32


From: Richard Frith-Macdonald
Subject: Re: Locking bug in NSMessagePortWin32
Date: Sat, 9 Sep 2006 18:01:39 +0100


On 5 Sep 2006, at 13:43, Wim Oudshoorn wrote:

Debugging under Windows is a little tricky, but in our application I observe
the following deadlock:

Thread 8:

NSMessagePort _setupSendPort line 145 self = 0x19a5770 Block on lock: this->lock
NSMessagePort newWithName line 208 Grabs lock: messagePortLock
       ...
NSMessagePort receivedEventRead line 638 self = 0x2a45c90



Thread 1:

NSMessagePort newWithName line 200 Block on lock: messagePortLock
       ...
NSMessagePort receivedEventRead line 638 self = 0x19a5770 Grabs lock: 0x19a5770->lock


Consequence:  DEADLOCK!


So here is a scenario how we can end up in this situation.

1 - Thread 8 sends a message to thread 1.
2 - Thread 1 replies to thread 8
3 - Thread X sends a message to thread 1.
4 - Thread 1 starts handling the message from Thread X and grabs
    the 0x19a5770->lock

5 - Thread 8 starts handling the reply of thread 1
6 - Thread 8 reads the send port of the reply and tries to
    get the port that was used to send the reply.
    For this it calls newWithName.

7 - Thread 8 grabs the messagePortLock in newWithName
8 - Thread 8 calls _setupSendPort on the message port 0x19a5770 which was used for sending
9 - Thread 8 tries to grab 0x19a5770->lock but fails (held by thread 1 in step 4)

10 - Thread 1 continues and wants to deduce the port that thread X used for sending,
11 - Thread 1 calls newWithName and blocks on messagePortLock (held by thread 8 in step 7)
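
The interleaving in steps 4 to 11 is a classic lock-order inversion. As a rough illustration only (Python rather than the Objective-C in question, with hypothetical names standing in for messagePortLock and the per-port lock), the sketch below forces the same ordering with barriers and uses a timeout in place of the indefinite block, so it reports the failure instead of hanging:

```python
import threading

def run_inversion_demo(timeout=0.2):
    # Hypothetical stand-ins for the two locks in the report.
    message_port_lock = threading.Lock()   # global messagePortLock
    port_lock = threading.Lock()           # per-port 0x19a5770->lock

    both_hold_first = threading.Barrier(2)  # both threads own their first lock
    both_tried = threading.Barrier(2)       # both threads attempted the second
    results = {}

    def worker(name, first, second):
        with first:
            both_hold_first.wait()
            # In the real code this acquire blocks forever; the timeout
            # is only here so the demonstration terminates.
            got = second.acquire(timeout=timeout)
            results[name] = got
            if got:
                second.release()
            # Keep holding `first` until the other thread has also tried.
            both_tried.wait()

    t8 = threading.Thread(target=worker,
                          args=("thread 8", message_port_lock, port_lock))
    t1 = threading.Thread(target=worker,
                          args=("thread 1", port_lock, message_port_lock))
    t8.start()
    t1.start()
    t8.join()
    t1.join()
    return results

if __name__ == "__main__":
    # Neither thread obtains its second lock.
    print(run_inversion_demo())
```

Each thread takes the two locks in the opposite order from the other, which is exactly what the two stack traces show.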


So an obvious fix is to try to make the locks non-nesting in
newPortName:  and initWithName:.

But:

A - I don't know if that is wrong

Seems plausible though.

B - I don't know if it is enough to fix the problem

I'm not sure either ... but there is no obvious way that this would happen if the call to _setupSendPort is moved outside the region protected by the messagePortLock ... so I've restructured the code that way.
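
To make the shape of that restructuring concrete, here is a minimal sketch in Python (not the actual GNUstep code; `Port`, `new_with_name`, `setup_send_port` and `ports_by_name` are hypothetical names chosen to mirror the discussion). The global lock covers only the table lookup and insertion, and the per-port lock is taken after the global lock has been dropped, so the two are never held together on this path:

```python
import threading

message_port_lock = threading.Lock()   # guards ports_by_name only
ports_by_name = {}                     # name -> Port (the uniquing table)

class Port:
    def __init__(self, name):
        self.name = name
        self.lock = threading.Lock()   # per-port lock
        self.send_ready = False

    def setup_send_port(self):
        # May block on the per-port lock, so callers must not hold
        # message_port_lock when calling this.
        with self.lock:
            self.send_ready = True

def new_with_name(name):
    # Step 1: find or create the port under the global lock.
    with message_port_lock:
        port = ports_by_name.get(name)
        if port is None:
            port = Port(name)
            ports_by_name[name] = port
    # Step 2: the global lock is released before touching the port lock.
    port.setup_send_port()
    return port
```

With this ordering, a thread stuck waiting for a port's own lock no longer prevents other threads from looking up ports by name.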

C - I just have this nagging feeling that _setupSendPort is
    useless anyway.  Why is it called on a port that already exists?

I think it is because the port may exist only for receiving and needs to be set up for sending too.

This code also suffered from the bug that we could potentially get double deallocation of a port if one thread searched the table and found it while another thread was performing a final release on it. I've added an implementation of -release which should fix that.
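
The idea behind such a -release can be sketched as follows. This is an illustration in Python with an explicit reference count, not the code that was committed; all names are hypothetical. The final decrement and the removal from the table happen under the same lock that lookups take, so a lookup either finds a port that is still alive (and retains it) or does not find it at all:

```python
import threading

table_lock = threading.Lock()   # shared by lookups and release
ports_by_name = {}              # uniquing table; does not own its entries

class Port:
    def __init__(self, name):
        self.name = name
        self.retain_count = 1   # owned by the creator
        self.deallocated = False

    def release(self):
        with table_lock:
            self.retain_count -= 1
            if self.retain_count > 0:
                return
            # Last reference: unregister while lookups are still excluded.
            ports_by_name.pop(self.name, None)
        self._dealloc()

    def _dealloc(self):
        if self.deallocated:
            raise RuntimeError("double deallocation of port " + self.name)
        self.deallocated = True

def find_or_create(name):
    with table_lock:
        port = ports_by_name.get(name)
        if port is None:
            port = Port(name)
            ports_by_name[name] = port
        else:
            port.retain_count += 1   # retain while still holding the lock
        return port
```

Without the lock in `release`, a second thread could find the port in the table between the last decrement and its removal, retain and release it again, and trigger a second deallocation.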

We need to review all the places where objects are 'uniqued' in a global table but are not permanently cached ... they probably all suffer from the same problem and need fixing.




