Re: [Qemu-devel] RFC: Reducing the size of entries in the qcow2 L2 cache
From: Alberto Garcia
Subject: Re: [Qemu-devel] RFC: Reducing the size of entries in the qcow2 L2 cache
Date: Wed, 20 Sep 2017 15:10:45 +0200
User-agent: Notmuch/0.18.2 (http://notmuchmail.org) Emacs/24.4.1 (i586-pc-linux-gnu)
On Wed 20 Sep 2017 09:06:20 AM CEST, Kevin Wolf wrote:
>> |-----------+--------------+-------------+---------------+--------------|
>> | Disk size | Cluster size | L2 cache | Standard QEMU | Patched QEMU |
>> |-----------+--------------+-------------+---------------+--------------|
>> | 16 GB | 64 KB | 1 MB [8 GB] | 5000 IOPS | 12700 IOPS |
>> | 2 TB | 2 MB | 4 MB [1 TB] | 576 IOPS | 11000 IOPS |
>> |-----------+--------------+-------------+---------------+--------------|
>>
>> The improvements are clearly visible, but it's important to point out
>> a couple of things:
>>
>> - L2 cache size is always < total L2 metadata on disk (otherwise
>> this wouldn't make sense). Increasing the L2 cache size improves
>> performance a lot (and makes the effect of these patches
>> disappear), but it requires more RAM.
>
> Do you have the numbers for the two cases above if the L2 tables
> covered the whole image?
Yeah, sorry, it's around 60000 IOPS in both cases (more or less what I
also get with a raw image).
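For reference, the bracketed coverage figures in the table above follow directly from the qcow2 on-disk layout (each L2 entry is 8 bytes and maps one data cluster); a quick sketch of the arithmetic:

```python
# How much guest data a given L2 cache can map, assuming the standard
# qcow2 layout: 8-byte L2 entries, one entry per data cluster.
L2_ENTRY_SIZE = 8

def covered_bytes(cache_size, cluster_size):
    """Guest data covered by an L2 cache of the given size."""
    return (cache_size // L2_ENTRY_SIZE) * cluster_size

MB = 1024 ** 2
GB = 1024 ** 3
TB = 1024 ** 4

# The two rows from the benchmark table:
assert covered_bytes(1 * MB, 64 * 1024) == 8 * GB   # 16 GB disk, 64 KB clusters
assert covered_bytes(4 * MB, 2 * MB) == 1 * TB      # 2 TB disk, 2 MB clusters
```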
>> - Doing random reads over the whole disk is probably not a very
>> realistic scenario. During normal usage only certain areas of the
>> disk need to be accessed, so performance should be much better
>> with the same amount of cache.
>> - I wrote a best-case scenario test (several I/O jobs each accessing
>> a part of the disk that requires loading its own L2 table) and my
>> patched version is 20x faster even with 64KB clusters.
>
> I suppose you chose the scenario so that the number of jobs is larger
> than the number of cached L2 tables without the patch, but smaller
> than the number of cache entries with the patch?
Exactly, I should have made that explicit :) I had 32 jobs, each one of
them limited to a small area (32MB), so with 4K pages you only need
128KB of cache memory (vs 2MB with the current code).
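The memory figures work out as follows (a sketch using the numbers from the message: 32 jobs, 32 MB working set each, 64 KB clusters; note that a 4 KB chunk of 8-byte entries maps exactly 512 * 64 KB = 32 MB, so one chunk per job suffices):

```python
KB = 1024
MB = 1024 ** 2
jobs = 32

# One 4 KB chunk (512 eight-byte entries) maps a job's whole 32 MB area:
assert (4 * KB // 8) * 64 * KB == 32 * MB

# Patched code caches one 4 KB chunk per job:
patched = jobs * 4 * KB          # 128 KB total
# Current code must cache a full 64 KB L2 table per job:
current = jobs * 64 * KB         # 2 MB total

assert patched == 128 * KB
assert current == 2 * MB
```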
> We will probably need to do some more benchmarking to find a good
> default value for the cached chunks. 4k is nice and small, so we can
> cover many parallel jobs without using too much memory. But if we have
> a single sequential job, we may end up doing the metadata updates in
> small 4k chunks instead of doing a single larger write.
Right, although a 4K table can already hold pointers to 512 data
clusters, so even if you do sequential I/O you don't need to update the
metadata so often, do you?
I guess the default value should probably depend on the cluster size.
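The point about sequential I/O can be made concrete: with 8-byte entries, a 4 KB chunk maps 512 clusters, so the amount of guest data allocated before a sequential writer moves on to the next chunk scales with the cluster size. A sketch:

```python
KB = 1024
MB = 1024 ** 2

def chunk_coverage(chunk_size, cluster_size):
    """Guest data mapped by one cached L2 chunk (8-byte entries)."""
    return (chunk_size // 8) * cluster_size

# A 4 KB chunk holds 512 entries, so with 64 KB clusters a sequential
# writer allocates 32 MB of data before touching the next chunk:
assert chunk_coverage(4 * KB, 64 * KB) == 32 * MB
# With 2 MB clusters the same 4 KB chunk maps a full 1 GB:
assert chunk_coverage(4 * KB, 2 * MB) == 1024 * MB
```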
>> - We need a proper name for these sub-tables that we are loading
>> now. I'm actually still struggling with this :-) I can't think of
>> any name that is clear enough and not too cumbersome to use (L2
>> subtables? => Confusing. L3 tables? => they're not really that).
>
> L2 table chunk? Or just L2 cache entry?
Yeah, something like that, but let's see how variables end up being
named :)
Berto