[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: Hashing, and NSDictionaries
From: |
Richard Frith-Macdonald |
Subject: |
Re: Hashing, and NSDictionaries |
Date: |
Tue, 12 Mar 2002 19:48:46 +0000 |
On Tuesday, March 12, 2002, at 07:14 PM, Stephen Brandon wrote:
Hi,
Just a query about the way NSDictionaries do their lookups.
It seems that although NSDictionary creates a hash of all keys handed
to it,
once it finds a match between a stored hash and the hash of a
(possible) key
sent to it, it then does an isEqual: between the stored key and the
query key.
Why is this final equality check necessary? Isn't the fact that it
already
has a hash of the stored key and the query key, and the fact that they
match,
good enough?
Only if you want to get the wrong values from the dictionary :-)
-hash is a many to one operation ...
For instance, there are many possible strings with the same hash value
(if it wasn't
for the limit on the length of strings, there would be an infinite
number).
Even if the objects you are storing were guaranteed to have a one to one
relationship
with their hash values, the dictionaries would still need to use
isEqual: because
they do *NOT* find 'a match between a stored hash and the hash of a
(possible) key'.
They use the hash of the (possible) key to determine which 'bucket' a
corresponding
stored key-value pair must be (if present). That bucket may contain many
key-value
pairs (though the software tries to keep the number of entries per
bucket to a
minimum).
> In my situation, my -isEqual does a new hash of self and the other
object and
compares them -- so the number of hashes done can be very large! And
there's
quite a lot of redundant hashing going on.
Well, that depends on your -isEqual: implementation ... try implementing
it without
hashing if the hashing is a large overhead.
GNUstep-base uses -hash in -isEqualToString: - but that's because the
concrete
subclasses of NSString cache their hash values and don't need to
recompute them,
and because we expect isEqualToString: to be used in many places other
than
dictionaries and sets, so performance is designed to be good overall.
If you are writing an app which makes very heavy use of objects in
dictionaries,
optimise your objects for that, and write their -isEqual: methods
without using
-hash.