From: Wolfgang Lux
Subject: Re: Please test new NSLock implementation!
Date: Fri, 4 Sep 2009 09:29:01 +0200

Fred Kiefer wrote:
The old version used objc_mutex_t, which was a void*. A mutex is typically either one or two words, depending on the implementation. Using malloc for this is very wasteful in terms of speed, cache usage, and memory footprint. The objc_mutex_alloc() function was doing malloc(sizeof(pthread_mutex_t)); on some platforms (e.g. FreeBSD) pthread_mutex_t is a pointer to some other structure, so we were tying up three cache lines for a two word data structure. This is far from ideal.

Not sure if I am getting this correctly, but on my 64-bit Linux system I have # define __SIZEOF_PTHREAD_MUTEX_T 40, and 24 for a 32-bit system. We are rather talking about 12 to 20 words here, not one or two.

Indeed, you are getting this correct. Using the test program

    #include <pthread.h>
    #include <stdio.h>

    int main()
    {
        pthread_mutex_t mutex;
        /* sizeof yields a size_t, so print it with %zu */
        printf("sizeof(mutex) = %zu\n", sizeof(mutex));
        return 0;
    }

I've collected the size of a pthread mutex (in bytes) on a few x86 platforms:

    OS X 10.5 (x86):     44
    Solaris 10 (x86):    24
    NetBSD 5.0 (x86):    28
    OpenBSD 4.5 (x86):    4
    FreeBSD 7.2 (x86):    4
    OpenSUSE 11.1 (x86): 24

So it looks like David is attempting to optimize code for his platform while making things worse for everybody else :-(.

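Since the discussion switches between bytes and words, a small variant of the test program can report both. This is only a sketch: the helper names mutex_size_in_words() and report_mutex_size() are made up for illustration, and "word" is assumed here to mean the size of a pointer on the platform being tested.

```c
#include <pthread.h>
#include <stdio.h>

/* Hypothetical helper: size of pthread_mutex_t in pointer-sized words,
 * rounded up. "Word" is assumed to mean sizeof(void *). */
static size_t mutex_size_in_words(void)
{
    return (sizeof(pthread_mutex_t) + sizeof(void *) - 1) / sizeof(void *);
}

/* Prints both figures; the values depend entirely on the platform. */
static void report_mutex_size(void)
{
    printf("sizeof(pthread_mutex_t) = %zu bytes, about %zu pointer-sized word(s)\n",
           sizeof(pthread_mutex_t), mutex_size_in_words());
}
```

Calling report_mutex_size() from main() on each platform of interest gives figures that can be compared directly against the "one or two words" estimate.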
What do others think, is it worthwhile to hide these implementation details via another indirection or not? I am still in favour of using an opaque data type here. On systems like FreeBSD where pthread_mutex_t is itself a pointer we could use that directly and on other systems we have one additional malloc and free call per mutex. And where the structure fits into your one or two words we could even put the value into the ivar directly.

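For illustration, the opaque-type option with one extra malloc and free per mutex might look roughly like the following. This is a minimal sketch, not the actual GNUstep code: gs_lock_t and the gs_lock_* functions are hypothetical names, and the variants that reuse a pointer-typed pthread_mutex_t directly or store a small mutex in the ivar are not shown.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical opaque handle; a public header would carry only this typedef. */
typedef struct gs_lock gs_lock_t;

/* Private definition, visible only to the implementation file. */
struct gs_lock {
    pthread_mutex_t mutex;
};

/* Allocates and initialises a lock; returns NULL on failure. */
gs_lock_t *gs_lock_create(void)
{
    gs_lock_t *lock = malloc(sizeof *lock);
    if (lock == NULL)
        return NULL;
    if (pthread_mutex_init(&lock->mutex, NULL) != 0) {
        free(lock);
        return NULL;
    }
    return lock;
}

/* Destroys the mutex and releases the storage; NULL is accepted. */
void gs_lock_destroy(gs_lock_t *lock)
{
    if (lock != NULL) {
        pthread_mutex_destroy(&lock->mutex);
        free(lock);
    }
}

/* Thin wrappers; both return 0 on success, like the pthread calls. */
int gs_lock_acquire(gs_lock_t *lock) { return pthread_mutex_lock(&lock->mutex); }
int gs_lock_release(gs_lock_t *lock) { return pthread_mutex_unlock(&lock->mutex); }
```

With this layout, code that holds a gs_lock_t * never needs to know sizeof(pthread_mutex_t), at the cost of one allocation and one pointer dereference per lock.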
I absolutely agree with you here. The interface should not expose any of the implementation details unless there is a really pressing need to do so. I understand David's reasoning that the approach with an opaque pointer may lead to additional cache misses on FreeBSD (and apparently OpenBSD as well), but without hard figures -- i.e., benchmarks for real world programs making intensive use of NSLocks that show a substantial performance improvement -- I consider this kind of coding premature optimization, which should be avoided.

Wolfgang