[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: Capacity of NSDictionary
From: |
Richard Frith-Macdonald |
Subject: |
Re: Capacity of NSDictionary |
Date: |
Fri, 1 Feb 2008 18:39:39 +0000 |
On 1 Feb 2008, at 17:45, Andreas Höschler wrote:
Hi all,
I am using a dictionary to hold business objects. As the key I use a
class SOObjectID which basically looks as follows:
@interface SOObjectID : NSObject < NSCopying, SRArchivingProtocol >
{
int _identifier; // entity
int _primaryKey;
}
- (unsigned int)hash
{
return (_primaryKey * 1000 + _identifier);
}
- (BOOL)isEqual:(id)object
{
if (self == object) return YES;
if ([object isKindOfClass:[SOObjectID class]])
{
return ([(SOObjectID *)object identifier] == _identifier &&
[(SOObjectID *)object primaryKey] == _primaryKey);
}
else return NO;
}
I am hitting a serious performance problem when holding many objects
in the dictionary. Currently I am running a test with 3 GB of data
loaded into the process (running under Solaris 10) and it's getting
slow as hell. The machine has 16 GB of RAM so swapping is not an
issue. I know that the above is not the cutest way to hold great
chunks of data. I already wondered whether it would be a good idea
to replace the NSDictionary with something that uses a binary tree
to hold the data. However, since this is not done in a couple of
minutes I would appreciate to find an easier solution for now. How
many objects should a dictionary be able to handle with reasonable
performance? Anything I could do in my SOGlobalID class to improve
performance?
To the best of my knowledge, a dictionary should hold a truly huge
number of objects without any performance problems ... as long as the
keys it holds have a good -hash. An ideal hash should not only be
different for each object but should also have a random distribution
so that when part of it is used to select a bucket in the hash table,
you spread your lookups evenly over all the buckets.
You could try something like (untested code just hacked in):
- (unsigned) hash
{
int l = sizeof(int)*2;
unsigned char buf[l];
unsigned val = 0;
// If you use a union you may be able to get better perfomance with
an assignment than using memcpy() here.
memcpy(buf, &_primaryKey, sizeof(int));
memcpy(buf + sizeof(int), &_identifier, sizeof(int));
while (l-- > 0)
{
val = (val << 5) + val + buf[l];
}
return val;
}
When I was looking at hash functions for NSString several years ago,
this shifting and adding scheme was the best general purpose one I
could find.