[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-block] [PATCH v7 8/9] qcow2: skip writing zero buffers to empt
From: |
Max Reitz |
Subject: |
Re: [Qemu-block] [PATCH v7 8/9] qcow2: skip writing zero buffers to empty COW areas |
Date: |
Wed, 31 Jan 2018 18:40:43 +0100 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.5.2 |
On 2018-01-30 15:23, Anton Nefedov wrote:
>
>
> On 29/1/2018 11:28 PM, Max Reitz wrote:
>> On 2018-01-18 18:49, Anton Nefedov wrote:
>>> If COW areas of the newly allocated clusters are zeroes on the
>>> backing image,
>>> efficient bdrv_write_zeroes(flags=BDRV_REQ_ALLOCATE) can be used on
>>> the whole
>>> cluster instead of writing explicit zero buffers later in perform_cow().
>>>
>>> iotest 060:
>>> write to the discarded cluster does not trigger COW anymore.
>>> Use a backing image instead.
>>>
>>> iotest 066:
>>> cluster-alignment areas that were not really COWed are now detected
>>> as zeroes, hence the initial write has to be exactly the same size for
>>> the maps to match
>>>
>>> Signed-off-by: Anton Nefedov <address@hidden>
>>> ---
>>> qapi/block-core.json | 4 ++-
>>> block/qcow2.h | 6 +++++
>>> block/qcow2-cluster.c | 2 +-
>>> block/qcow2.c | 66
>>> ++++++++++++++++++++++++++++++++++++++++++++--
>>> block/trace-events | 1 +
>>> tests/qemu-iotests/060 | 26 +++++++++++-------
>>> tests/qemu-iotests/060.out | 5 +++-
>>> tests/qemu-iotests/066 | 2 +-
>>> tests/qemu-iotests/066.out | 4 +--
>>> 9 files changed, 98 insertions(+), 18 deletions(-)
>>
>> [...]
>>
>>> @@ -1875,6 +1880,52 @@ static bool is_zero(BlockDriverState *bs,
>>> int64_t offset, int64_t bytes)
>>> return res >= 0 && (res & BDRV_BLOCK_ZERO) && nr == bytes;
>>> }
>>> +static bool is_zero_cow(BlockDriverState *bs, QCowL2Meta *m)
>>> +{
>>> + return is_zero(bs, m->offset + m->cow_start.offset,
>>> + m->cow_start.nb_bytes) &&
>>> + is_zero(bs, m->offset + m->cow_end.offset,
>>> m->cow_end.nb_bytes);
>>> +}
>>> +
>>> +static int handle_alloc_space(BlockDriverState *bs, QCowL2Meta *l2meta)
>>> +{
>>> + BDRVQcow2State *s = bs->opaque;
>>> + QCowL2Meta *m;
>>> +
>>> + for (m = l2meta; m != NULL; m = m->next) {
>>> + int ret;
>>> +
>>> + if (!m->cow_start.nb_bytes && !m->cow_end.nb_bytes) {
>>> + continue;
>>> + }
>>> +
>>> + if (bs->encrypted) {
>>> + continue;
>>> + }
>>
>> Not sure if the compiler optimizes this anyway, but I'd pull this out of
>> the loop.
>>
>
> An imprint of the following patches (which were dropped from this
> series) - preallocation ahead of image EOF, which takes action
> regardless of image encryption.
>
> But I'll leave the check outside the loop until it's needed
> there.
>
>
>> Maybe you could put all of the conditions under which this function can
>> actually do something at its beginning: That is, we can't do anything if
>> the BDS is encrypted or if bs->file does not support BDRV_REQ_ALLOCATE
>> (and then you just call this function unconditionally in
>> qcow2_co_pwritev()).
>>
>
> Done.
>
>>> + if (!is_zero_cow(bs, m)) {
>>> + continue;
>>> + }
>>
>> Is this really efficient? I remember someone complaining about
>> bdrv_co_block_status() being kind of slow on some filesystems, so
>> there'd be a tradeoff depending on how it compares to just reading up to
>> two clusters from the backing file -- especially considering that the OS
>> can query the same information just as quickly, and thus the only
>> overhead the read should have is a memset() (assuming the OS is clever).
>>
>> So basically my question is whether it would be better to just skip this
>> if we have any backing file at all and only do this optimization if
>> there is none.
>>
>
> So what we trade between is
> (read+write) vs (lseek+fallocate or lseek+read+write).
>
> Indeed if it comes to lseek the profit is smaller, and we're probably
> unlikely to find a hole anyway.
>
> Maybe it's good enough to cover these cases:
> 1. no backing
> 2. beyond bdrv_getlength() in backing
> 3. unallocated in backing qcow2 (covers 'beyond bdrv_getlength()
> in backing->file')
>
> 1 & 2 are easy to check;
Not sure how useful 2 is, though. (I don't know. I always hear about
people wanting to optimize for such a case where a backing file is
shorter than the overlay, but I can't imagine a real use case for that.)
> 3: if that's not too hacky maybe we can do the bdrv_is_allocated() check
> for qcow2 exclusively and if there is raw (or any other format) backing
> image - do the COW
Yes, well, it seems useful in principle... But it would require general
block layer work indeed. Maybe a new flag for bdrv_block_status*() that
says only to look into the format layer?
Max
signature.asc
Description: OpenPGP digital signature
- [Qemu-block] [PATCH v7 9/9] iotest 134: test cluster-misaligned encrypted write, (continued)
[Qemu-block] [PATCH v7 4/9] block: treat BDRV_REQ_ALLOCATE as serialising, Anton Nefedov, 2018/01/18
Re: [Qemu-block] [PATCH v7 4/9] block: treat BDRV_REQ_ALLOCATE as serialising, Alberto Garcia, 2018/01/31