[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: hash resizing bug
From: |
Jim Meyering |
Subject: |
Re: hash resizing bug |
Date: |
Thu, 18 Jun 2009 16:18:56 +0200 |
Eric Blake wrote:
> According to Jim Meyering on 6/17/2009 1:21 PM:
>>> is not quite right. A hashing function may be somewhat ill-conditioned for
>>> one
>>> bucket size, but much better conditioned for another size. At any rate, it
>>> looks like the correct fix is to check for rehashing thresholds based on
>>> number
>>> of entries, not number of buckets, and that this check needs to happen on
>>> every
>>> insertion, not just insertions that land in an empty bucket. Or, at the
>>> very
>>> least, check for threshold PRIOR to insertion, rather than after, such that
>>> in
>>> the above case with m4, the very next call to hash_insert would notice the
>>> table was at 100%, and force another resize above 907 buckets.
>>>
>>> Thoughts, before I implement something along these lines?
>>
>> Sounds good.
>
> Hey - I just noticed that this also fixed the bug where hash_insert was
> inconsistent on NULL return whether an entry had been inserted or not.
> Now, you are guaranteed that a NULL return means the entry was not inserted.
>
> Here's two simple patches. I'm committing the first, as there should be
> enough cases where you end up searching for an element already known to be
> hashed, where making an indirect function call is needless overhead
> (particularly if the comparator function is expensive, and neglected to do
> the pointer comparison filter itself). And by documenting this behavior,
> the user's comparator functions can now be safely rewritten to avoid the
> pointer comparison filter.
Unless I become exceptionally unresponsive,
please post before pushing changes to hash.c.
I would have requested that you not mix clean-up like this with
a semantics-changing change set:
+ (hash_do_for_each, hash_clear, hash_free): Use C89 syntax.
- hash resizing bug (was: [PATCH] hash: declare some functions with the warn_unused_result attribute), (continued)
- hash resizing bug (was: [PATCH] hash: declare some functions with the warn_unused_result attribute), Eric Blake, 2009/06/17
- Re: hash resizing bug, Eric Blake, 2009/06/17
- Re: hash resizing bug, Eric Blake, 2009/06/17
- Re: hash resizing bug, Jim Meyering, 2009/06/17
- Re: hash resizing bug, Eric Blake, 2009/06/17
- Re: hash resizing bug, Jim Meyering, 2009/06/17
- Re: hash resizing bug, Eric Blake, 2009/06/17
- Re: hash resizing bug, Jim Meyering, 2009/06/18
- Re: hash resizing bug, Jim Meyering, 2009/06/17
- Re: hash resizing bug, Eric Blake, 2009/06/18
- Re: hash resizing bug,
Jim Meyering <=
- Re: hash resizing bug, Jim Meyering, 2009/06/18
- Re: hash resizing bug, Jim Meyering, 2009/06/18
- Re: hash resizing bug, Eric Blake, 2009/06/18
- Re: hash resizing bug, Jim Meyering, 2009/06/18
- Re: hash resizing bug, Eric Blake, 2009/06/18
- hash and bitrotate (was: hash resizing bug), Eric Blake, 2009/06/18
- Re: hash and bitrotate, Eric Blake, 2009/06/18
- Re: hash and bitrotate, Jim Meyering, 2009/06/18
- Re: hash and bitrotate, Simon Josefsson, 2009/06/18
- Re: hash resizing bug, Jim Meyering, 2009/06/18