Re: [Qemu-block] [PATCH] mirror: Improve zero-write and discard with fra

qemu-block

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-block] [PATCH] mirror: Improve zero-write and discard with fra

From:	Paolo Bonzini
Subject:	Re: [Qemu-block] [PATCH] mirror: Improve zero-write and discard with fragmented image
Date:	Tue, 10 Nov 2015 10:01:54 +0100
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.3.0


On 10/11/2015 07:14, Fam Zheng wrote:
> On Mon, 11/09 17:29, Kevin Wolf wrote:
>> Am 09.11.2015 um 17:18 hat Paolo Bonzini geschrieben:
>>>
>>>
>>> On 09/11/2015 17:04, Kevin Wolf wrote:
>>>> Am 06.11.2015 um 11:22 hat Fam Zheng geschrieben:
>>>>> The "pnum < nb_sectors" condition in deciding whether to actually copy
>>>>> data is unnecessarily strict, and the qiov initialization is
>>>>> unnecessarily too, for both bdrv_aio_write_zeroes and bdrv_aio_discard
>>>>> branches.
>>>>>
>>>>> Reorganize mirror_iteration flow so that we:
>>>>>
>>>>>     1) Find the contiguous zero/discarded sectors with
>>>>>     bdrv_get_block_status_above() before deciding what to do. We query
>>>>>     s->buf_size sized blocks at a time.
>>>>>
>>>>>     2) If the sectors in question are zeroed/discarded and aligned to
>>>>>     target cluster, issue zero write or discard accordingly. It's done
>>>>>     in mirror_do_zero_or_discard, where we don't add buffer to qiov.
>>>>>
>>>>>     3) Otherwise, do the same loop as before in mirror_do_read.
>>>>>
>>>>> Signed-off-by: Fam Zheng <address@hidden>
>>>>
>>>> I'm not sure where in the patch to comment on this, so I'll just do it
>>>> here right in the beginning.
>>>>
>>>> I'm concerned that we need to be more careful about races in this patch,
>>>> in particular regarding the bitmaps. I think the conditions for the two
>>>> bitmaps are:
>>>>
>>>> * Dirty bitmap: We must clear the bit after finding the next piece of
>>>>   data to be mirrored, but before we yield after getting information
>>>>   that we use for the decision which kind of operation we need.
>>>>
>>>>   In other words, we need to clear the dirty bitmap bit before calling
>>>>   bdrv_get_block_status_above(), because that's both the function that
>>>>   retrieves information about the next chunk and also a function that
>>>>   can yield.
>>>>
>>>>   If after this point the data is written to, we need to mirror it
>>>>   again.
>>>
>>> With Fam's patch, that's not trivial for two reasons:
>>>
>>> 1) bdrv_get_block_status_above() can return a smaller amount than what
>>> is asked.
>>>
>>> 2) the "read and write" case can handle s->granularity sectors per
>>> iteration (many of them can be coalesced, but still that's how the
>>> iteration works).
>>>
>>> The simplest solution is to perform the query with s->granularity size
>>> rather than s->buf_size.
>>
>> Then we end up with many small operations, that's not what we want.
>>
>> Why can't we mark up to s->buf_size dirty clusters as clean first, then
>> query the status, and mark all of those that we can't handle dirty
>> again?
> 
> Then we may end up marking more clusters as dirty than it should be.

You're both right.

> Because all bdrv_set_dirty() and bdrv_set_dirty_bitmap() callers are 
> coroutine,
> we can introduce a CoMutex to let bitmap reader block bdrv_set_dirty and
> bdrv_set_dirty_bitmap.

I think this is not necessary.

I think the following is safe:

1) before calling bdrv_get_block_status_above(), find out how many
consecutive bits in the dirty bitmap are 1

2) zero all those bits in the dirty bitmap

3) call bdrv_get_block_status_above() with a size equivalent to the
number of dirty bits

4) if bdrv_get_block_status_above() only returns a partial result, loop
step (3) until all the dirty bits are processed

For full mirroring, this strategy will probably make the first
incremental iteration more expensive.

Paolo

[Prev in Thread]

Current Thread

[Next in Thread]

[Qemu-block] [PATCH] mirror: Improve zero-write and discard with fragmented image, Fam Zheng, 2015/11/06
- Re: [Qemu-block] [PATCH] mirror: Improve zero-write and discard with fragmented image, Max Reitz, 2015/11/06
  - Re: [Qemu-block] [PATCH] mirror: Improve zero-write and discard with fragmented image, Fam Zheng, 2015/11/08
- Re: [Qemu-block] [PATCH] mirror: Improve zero-write and discard with fragmented image, Kevin Wolf, 2015/11/09
  - Re: [Qemu-block] [PATCH] mirror: Improve zero-write and discard with fragmented image, Paolo Bonzini, 2015/11/09
    - Re: [Qemu-block] [PATCH] mirror: Improve zero-write and discard with fragmented image, Kevin Wolf, 2015/11/09
    - Re: [Qemu-block] [PATCH] mirror: Improve zero-write and discard with fragmented image, Fam Zheng, 2015/11/10
    - Re: [Qemu-block] [PATCH] mirror: Improve zero-write and discard with fragmented image, Paolo Bonzini <=
    - Re: [Qemu-block] [PATCH] mirror: Improve zero-write and discard with fragmented image, Kevin Wolf, 2015/11/10
    - Re: [Qemu-block] [PATCH] mirror: Improve zero-write and discard with fragmented image, Paolo Bonzini, 2015/11/10

Prev by Date: [Qemu-block] [PATCH v3 2/2] mirror: Improve zero-write and discard with fragmented image
Next by Date: Re: [Qemu-block] [Qemu-devel] [PATCH v6 3/4] qmp: add monitor command to add/remove a child
Previous by thread: Re: [Qemu-block] [PATCH] mirror: Improve zero-write and discard with fragmented image
Next by thread: Re: [Qemu-block] [PATCH] mirror: Improve zero-write and discard with fragmented image
Index(es):
- Date
- Thread