[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-block] [PATCH 5/6] qcow2: use a hash to look for entries in th
From: |
Alberto Garcia |
Subject: |
Re: [Qemu-block] [PATCH 5/6] qcow2: use a hash to look for entries in the L2 cache |
Date: |
Thu, 07 May 2015 10:23:17 +0200 |
User-agent: |
Notmuch/0.13.2 (http://notmuchmail.org) Emacs/23.2.1 (i486-pc-linux-gnu) |
On Wed 06 May 2015 06:42:30 PM CEST, Stefan Hajnoczi wrote:
>> By using a hash of the offset as the starting point, lookups are
>> faster and independent from the array size.
>>
>> The hash is computed using the cluster number of the table, multiplied
>> by 4 to make it perform better when there are collisions.
> ...
>> + i = lookup_index = (offset / c->table_size * 4) % c->size;
> The hash function is very weak. It's better than always starting with
> i=0 but mixes input bits very poorly. Have you considered an integer
> hash function so that more input bits affect the output?
> http://burtleburtle.net/bob/hash/integer.html
I didn't try very hard to optimize the hash function because it doesn't
have a big impact, I just wanted something simple and a bit better than
i=0.
I did my tests with a disk image with preallocated metadata and a cache
size big enough for the whole disk. In that scenario I managed an
average of 2.5 comparisons per lookup with this function.
> Regarding collisions, the multiply-by-4 effectively reduces the hash
> space by a factor of 4. This *increases* the chance of collisions in
> the hope that the extra slots will prevent chains from forming.
It does, but it keeps the average much lower (in my tests at least), it
went down from >100 to 2.5.
> Feel free to leave the hash function as-is, I don't think it's a huge
> issue either way.
I think I'll just leave it as-is for the time being, but thanks a lot
for your comments.
Berto