From: Denis Lunev
Subject: Re: [Qemu-devel] [RFC] Re-evaluating subcluster allocation for qcow2 images
Date: Fri, 28 Jun 2019 14:47:32 +0000

On 6/28/19 5:43 PM, Alberto Garcia wrote:
> On Thu 27 Jun 2019 06:05:55 PM CEST, Denis Lunev wrote:
>>>> Thus, with respect to these patterns, subclusters could give us the
>>>> benefits of fast random IO and a good reclaim rate.
>>> Exactly, but that fast random I/O would only happen when allocating
>>> new clusters. Once the clusters are allocated it doesn't provide any
>>> additional performance benefit.
>> No, I am talking about the situation after the allocation. That is the
>> main reason why I have a feeling that sub-clusters could provide a
>> benefit.
>>
>> OK. Situation (1) is the following:
>> - the disk is completely allocated
>> - QCOW2 image size is 8 TB
>> - we have an image with 1 MB clusters / 64 KB sub-clusters (for simplicity)
>> - L2 metadata cache size is 128 MB (64 MB L2 tables, 64 MB other data)
>> - holes are made on a sub-cluster basis, i.e. with 64 KB granularity
>>
>> In this case a random IO test will give near-native IOPS
>> results. Metadata is in memory, so no additional reads are
>> required. Wasted host filesystem space (due to cluster size) is kept
>> at a minimum, i.e. at the level of "pre-subcluster" QCOW2.
>>
>> Situation (2):
>> - 8 TB QCOW2 image, completely allocated
>> - 1 MB cluster size, 128 MB L2 cache size
>>
>> Nearly the same performance as (1), but much smaller disk space
>> savings for holes.
>>
>> Situation (3):
>> - 8 TB QCOW2 image, completely allocated
>> - 64 KB cluster size, 128 MB L2 cache
>>
>> Random IO performance is halved compared to (1) and (2) due to a
>> metadata re-read for each subsequent operation. Same disk space
>> savings as in case (1).
> If I understood correctly what you are trying to say, subclusters allow
> us to use larger cluster sizes in order to reduce the amount of L2
> metadata (and therefore the cache size) while keeping the same space
> benefits as smaller clusters.
yes
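
For reference, the L2 sizes behind situations (1) to (3) can be checked
with a few lines of arithmetic. This is only a rough sketch: it assumes
one 8-byte L2 entry per cluster (the classic qcow2 layout) and treats
the sizes as binary units; an entry extended with a subcluster bitmap
would be larger. The helper names are made up for illustration.

```python
# Back-of-the-envelope L2 metadata size for a fully allocated image.
# Assumption: one 8-byte L2 entry per cluster; an extended entry that
# carries a subcluster bitmap would need more space per cluster.
TIB = 1024 ** 4
MIB = 1024 ** 2
KIB = 1024

L2_ENTRY_BYTES = 8  # assumed entry size


def l2_metadata_bytes(image_bytes, cluster_bytes, entry_bytes=L2_ENTRY_BYTES):
    """Bytes of L2 tables needed to map every cluster of the image."""
    clusters = -(-image_bytes // cluster_bytes)  # ceiling division
    return clusters * entry_bytes


def fits_in_cache(image_bytes, cluster_bytes, cache_bytes):
    """True if all L2 tables for the image fit into the given cache."""
    return l2_metadata_bytes(image_bytes, cluster_bytes) <= cache_bytes


if __name__ == "__main__":
    image = 8 * TIB
    l2_cache = 64 * MIB  # the share of the 128 MB cache used for L2 tables
    for label, cluster in (("1 MB clusters", MIB), ("64 KB clusters", 64 * KIB)):
        need = l2_metadata_bytes(image, cluster)
        print(f"{label}: {need // MIB} MiB of L2 tables, "
              f"fits in cache: {fits_in_cache(image, cluster, l2_cache)}")
```

Under these assumptions the 1 MB case needs exactly the 64 MB mentioned
in (1), while 64 KB clusters need sixteen times as much, which is why
(3) has to re-read metadata.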

>> Please note, I am not talking now about your case with COW. Here the
>> allocation is performed on a sub-cluster basis, i.e. the absence of
>> a sub-cluster in the image means a hole at that offset. This is an
>> important difference.
> I mentioned the possibility that if you have a case like 2MB / 64KB and
> you write to an empty cluster then you could allocate the necessary
> subclusters, and additionally fallocate() the space of the whole cluster
> (2MB) in order to try to keep it contiguous.
>
> With this we would lose the space saving advantage of having
> subclusters. But perhaps that would work for smaller cluster sizes (it
> would mitigate the fragmentation problem).
Yes, this is the distinction, and it is a completely different use case.
We have gathered it over time from our customers,
who want very fast performance AND space conservation
at once. This is still the case for users of SSDs, which are
fast but small.
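
To make the fallocate() variant quoted above concrete, here is a toy
sketch of the idea: reserve the whole 2 MB cluster, then write only the
64 KB subcluster that is actually needed. It is not QEMU code. The
function name and the "cluster index times cluster size" offset are
assumptions for illustration, and it uses Python's os.posix_fallocate(),
which is available on Unix only.

```python
import os
import tempfile

CLUSTER = 2 * 1024 * 1024   # 2 MB cluster, as in the quoted example
SUBCLUSTER = 64 * 1024      # 64 KB subcluster


def write_subcluster(fd, cluster_index, sub_index, data):
    """Reserve a whole cluster, then write one subcluster into it.

    Hypothetical helper: a real image format would look up the host
    offset in its metadata instead of computing it from the index.
    """
    if len(data) != SUBCLUSTER:
        raise ValueError("data must be exactly one subcluster long")
    cluster_off = cluster_index * CLUSTER
    # Ask the filesystem to reserve space for the full cluster so that
    # later subcluster writes are more likely to stay contiguous.
    os.posix_fallocate(fd, cluster_off, CLUSTER)
    os.pwrite(fd, data, cluster_off + sub_index * SUBCLUSTER)


if __name__ == "__main__":
    with tempfile.TemporaryFile() as f:
        write_subcluster(f.fileno(), 0, 3, b"\xab" * SUBCLUSTER)
        st = os.fstat(f.fileno())
        print("file size:", st.st_size)
        print("allocated bytes:", st.st_blocks * 512)
```

On filesystems that honour the reservation, the allocated size covers
the whole cluster after a single subcluster write, which is the lost
space saving mentioned in the quoted text.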

Den
