[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Please test new NSLock implementation!

From: Richard Frith-Macdonald
Subject: Re: Please test new NSLock implementation!
Date: Sun, 6 Sep 2009 17:56:34 +0100

On 4 Sep 2009, at 08:29, Wolfgang Lux wrote:

Fred Kiefer wrote:

The old version used objc_mutex_t, which was a void*.  A mutex is
typically either one or two words, depending on the implementation.
Using malloc for this is very wasteful, both in terms of speed, cache usage, and memory footprint. The objc_mutex_alloc() function was doing
malloc(sizeof(pthread_mutex_t)); on some platforms (e.g. FreeBSD)
pthread_mutex_t is a pointer to some other structure, so we were tying up three cache lines for a two word data structure. This is far from

Not sure if I getting this correctly but on my 64-bit Linux system I
have # define __SIZEOF_PTHREAD_MUTEX_T 40 and 24 for a 32-bit system.
We are rather talking about 12 to 20 words here not one or two.

Indeed your are getting this correct. Using the test program
#include <pthread.h>
int main()
 pthread_mutex_t mutex;
 printf("sizeof(mutex) = %d\n", sizeof(mutex));
 return 0;
I've collected the size of a pthread mutex on a few x86 platforms:
OS X 10.5 (x86): 44
Solaris 10 (x86): 24
NetBSD 5.0 (x86): 28
OpenBSD 4.5 (x86): 4
FreeBSD 7.2 (x86): 4
OpenSUSE 11.1 (x86): 24
So it looks like David is attempting to optimize code for his platform
while making things worse for everybody else :-(.

Fred was IMO completely correct to point out that putting pthread.h and types declared in it in NSLock.h exposes internal workings ... and we have a policy of not doing that as a general principle. However, I think it's an unjustified jump to saying this is making things worse for everyone else ... you need to look at the practical effects of the change. ... se below.

What do others think, is it worthwhile to hide these implementation
details via another indirection or not? I am still in favour of using an opaque data type here. On systems like FreeBSD where pthread_mutex_t is itself a pointer we could use that directly and on other systems we have one additional malloc and free call per mutex. And where the structure
fits into your one or two words we could even put the value into the
ivar directly.

I absolutely agree with you here. The interface should not expose any of the implementation details unless there is a really pressing need to do
so. I understand David's reasoning that the approach with an opaque
pointer may lead to additional cache misses on FreeBSD (and apparently
OpenBSD as well), but without hard figures -- i.e., benchmarks for real
world programs making intensive use of NSLocks that show a substantial
performance improvement -- I consider this kind of coding premature
optimization which should be avoided.

OK ... I agree with this principle, but look at why ...

1. we don't want to expose internal workings because we don't want developers to start to depend on those internals in such a way that it's hard for us to change things later without breaking existing applications. 2. we don't want to expose internal workings because changes to them may break API and mean that apps need to be recompiled.

Issue 1 is real, but we can't quantify how important it is as it's a amatter of perceptions rather than a true technical issue. Luckily it's quite easily largely fixable, simply by removing pthread.h form the header and replacing the types from pthread.h with opaque dummy types of the same size, so that there is no temptation for developers to use them directly. So I did that, though really, I'm not sure that was worth the bother, since the ivars concerned were already clearly marked as private. I guess it's just good to do it to avoid having the extra symbols polluting developers namespaces.

Issue 2 is a technical problem ... if someone subclasses one of the lock classes, their compiled code is obviously dependent on the size of the superclass and if that size changes (eg due to using a different pthread implementation) then the size of the superclass would change even though the version of the base library is unchanged. So potentially an app linked with one copy of the base library would fail to run properly with another copy of the library even though it was the same version!

That sounds really bad, until you think about the chances of it actually happening ... you would have to have two versions of the pthread library, with different sized pthread_mutext_t types available on the same system. You would have to have apps which are subclassing NSLock or NSRecursiveLock or NSCondition. You would have to have built those apps with a base library built using one version of the pthread library, and then you would have to be running with a base library built with the other version of the pthread library. In reality, I don't expect that would ever happen (unless perhaps, a developer is playing with thread library development themselves ... in which case they are unlikely to have any trouble spotting/resolving the issue on their own).

Balanced against that is the modest complexity of allocating the mutex using malloc and freeing it when the NSLock is deallocated, and the performance hit of doing that, and of performing an extra indirection on every locking operation. That's a small performance penalty ... but probably worth avoiding when you consider that locking is hugely heavily used in any threaded application.

The trade-off (source code simplicity and improved performance against ABI stability) is difficult to call, but I think David made the right design decision here.

reply via email to

[Prev in Thread] Current Thread [Next in Thread]