discuss-gnustep
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Capacity of NSDictionary


From: Richard Frith-Macdonald
Subject: Re: Capacity of NSDictionary
Date: Fri, 1 Feb 2008 18:39:39 +0000


On 1 Feb 2008, at 17:45, Andreas Höschler wrote:

Hi all,

I am using a dictionary to hold business objects. As the key I use a class SOObjectID which basically looks as follows:

@interface SOObjectID : NSObject < NSCopying, SRArchivingProtocol >
{
  int _identifier; // entity
  int _primaryKey;
}

- (unsigned int)hash
{
  return (_primaryKey * 1000 + _identifier);
}

- (BOOL)isEqual:(id)object
{
  if (self == object) return YES;
  if ([object isKindOfClass:[SOObjectID class]])
    {
return ([(SOObjectID *)object identifier] == _identifier && [(SOObjectID *)object primaryKey] == _primaryKey);
    }
  else return NO;
}

I am hitting a serious performance problem when holding many objects in the dictionary. Currently I am running a test with 3 GB of data loaded into the process (running under Solaris 10) and it's getting slow as hell. The machine has 16 GB of RAM so swapping is not an issue. I know that the above is not the cutest way to hold great chunks of data. I already wondered whether it would be a good idea to replace the NSDictionary with something that uses a binary tree to hold the data. However, since this is not done in a couple of minutes I would appreciate to find an easier solution for now. How many objects should a dictionary be able to handle with reasonable performance? Anything I could do in my SOGlobalID class to improve performance?

To the best of my knowledge, a dictionary should hold a truly huge number of objects without any performance problems ... as long as the keys it holds have a good -hash. An ideal hash should not only be different for each object but should also have a random distribution so that when part of it is used to select a bucket in the hash table, you spread your lookups evenly over all the buckets.

You could try something like (untested code just hacked in):

- (unsigned) hash
{
  int l = sizeof(int)*2;
  unsigned char buf[l];
  unsigned val = 0;

// If you use a union you may be able to get better perfomance with an assignment than using memcpy() here.
  memcpy(buf, &_primaryKey, sizeof(int));
  memcpy(buf + sizeof(int), &_identifier, sizeof(int));
  while (l-- > 0)
    {
      val = (val << 5) + val + buf[l];
    }
  return val;
}

When I was looking at hash functions for NSString several years ago, this shifting and adding scheme was the best general purpose one I could find.





reply via email to

[Prev in Thread] Current Thread [Next in Thread]