[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

GSIMapUnlinkNodeFromMap doesn't scale

From: James Knight
Subject: GSIMapUnlinkNodeFromMap doesn't scale
Date: Mon, 27 May 2002 21:44:56 -0400

When using a hashtable, linear behavior is extremely undesirable. However, GSIMapUnlinkNodeFromMap does just that - it loops over the whole map in order to find the node to unlink from the list of elements in the map.

In a program I'm testing, GSIMapUnlinkNodeFromMap is taking about 70% of the execution time. It has a map of around 100000 items and removes 10000 of them, and takes 40sec to do it (on a reasonably fast machine). That operation should not be incredibly stressful, but, because of the extra list of elements kept around, it takes forever. I don't see any purpose behind keeping that extra linked-list of all elements around, in any case. It seems like both a waste of memory, and a waste of time.

The only thing it is used for is enumeration, and enumeration works just fine by enumerating over each item in each bucket. The only thing wrong with doing it this way is Bad Things Will Happen if you add an item to the map which causes it to be resized while enumerating. However, proper operation under those circumstances is explicitly disclaimed in the API spec, so I see little reason to support it. Reasonably fast deletion of items is more important than supporting an unspecified behavior.

I suspect the reason someone put that in however many years ago was in a misguided attempt to guarantee ordering of elements. (IE: new elements always appear first when enumerating). This behavior is not necessary - if the user wants to guarantee ordering, he should keep a seperate array.

I made up a patch which removes the extra linked list and therefore fixes the speed problem. Most of the changes are in GSIMap.h. There are minor changes in other files to make them use the iteration functions where they had been directly iterating over the linked list themselves.


It appears to work correctly. However - It adds a field to NSHashEnumeration and NSMapEnumeration. They now have 3 fields, which is in line with what the Apple API spec says, but not with what they previously were. This is a problem for binary compatibility (if anyone actually uses the NSMapTable/NSHashTable functions). If binary compatibility is necessary for unstable versions, a workaround could be coded to use one of the fields in the two-field version as a pointer to the real data which is allocated in NSEnumerateHashTable and freed in NSEndHashTableEnumeration.


PS: there are other patches on <http://languor.static.net/~jknight/gnustep-patches/> that you might also want to apply. I've mentioned these problems in prior emails, but I'll just mention them again:

- FastMallocBuffer-bug.patch gets rid of FastMallocBuffer when GC is enabled, as it is not necessary, and the implementation calls into the autorelease pool directly, which causes problems in GC, as the autorelease pools are not used. - NSData-gc-bug.patch fixes a couple of trivial naming bugs which cause NSData to not even compile with GC enabled. - GC-release.patch fixes a few of the places I've found where memory isn't zeroed in a collection after releasing an object it contains, thus creating problems with objects staying around longer than their intended lifetime when GC is enabled.


reply via email to

[Prev in Thread] Current Thread [Next in Thread]