[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH] qemu-img: align is_allocated_sectors to 4k
From: |
Peter Lieven |
Subject: |
Re: [Qemu-devel] [PATCH] qemu-img: align is_allocated_sectors to 4k |
Date: |
Mon, 25 Jun 2018 22:29:50 +0200 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.2.1 |
Am 11.06.2018 um 16:04 schrieb Max Reitz:
> On 2018-06-11 15:59, Peter Lieven wrote:
>> Am 11.06.2018 um 15:30 schrieb Max Reitz:
>>> On 2018-06-07 14:46, Peter Lieven wrote:
>>>> We currently don't enforce that the sparse segments we detect during
>>>> convert are
>>>> aligned. This leads to unnecessary and costly read-modify-write
>>>> cycles either
>>>> internally in Qemu or in the background on the storage device as
>>>> nearly all
>>>> modern filesystems or hardware has a 4k alignment internally.
>>>>
>>>> As we per default set the min_sparse size to 4k it makes perfectly
>>>> sense to ensure
>>>> that these sparse holes in the file are placed at 4k boundaries.
>>>>
>>>> The number of RMW cycles when converting an example image [1] to a
>>>> raw device that
>>>> has 4k sector size is about 4600 4k read requests to perform a total
>>>> of about 15000
>>>> write requests. With this path the 4600 additional read requests are
>>>> eliminated.
>>>>
>>>> [1]
>>>> https://cloud-images.ubuntu.com/releases/16.04/release/ubuntu-16.04-server-cloudimg-amd64-disk1.vmdk
>>>>
>>>>
>>>> Signed-off-by: Peter Lieven <address@hidden>
>>>> ---
>>>> qemu-img.c | 21 +++++++++++++++------
>>>> 1 file changed, 15 insertions(+), 6 deletions(-)
>>> I like the idea, but it doesn't seem guaranteed that
>>> is_allocated_sectors() is called on aligned offsets, so this alignment
>>> work may still leave things unaligned.
>> I can't image why this should happen. As long as the alignment devides
>> the buffer size we either
>> write or skip aligned bytes. Maybe get_block_status returns an unaligned
>> number of sectors?
> Yes, because the source medium does not need to be the same as the
> destination (so the source may have e.g. 512-byte clusters).
>
>>> Furthermore, we should probably not blindly assume 4k but instead use
>>> some block limit of the target, like pwrite_zeroes_alignment, or
>>> pdiscard_alignment, depending on the case. (Or probably still
>>> min_sparse, if that's less.)
>>>
>>> Since is_allocated_sectors_min() (the only caller of
>>> is_allocated_sectors()) is called from just a single place, taking those
>>> factors into account should be possible.
>> I also thought of this, but for instance for raw-posix I always get a
>> request_alignment of 1.
> Yes, because request_alignment is a hard requirement. With caching, you
> can send requests with any alignment, so it's 1.
>
> pwrite_zeroes_alignment and pdiscard_alignment are described as "Optimal
> alignment", so those should contain the values we/you want. If they are
> 0, then you should probably fall back to opt_transfer instead of
> request_alignment.
I am still trying to figure out what is the best solution. If I take the optima
into
account I might ending up transfering more data than necessary just to create
an optimal
request. I just want to avoid unnecessary RMW cycles. And even if modern byte
interfaces
advertise a request_alignment of 1 someone has to do the RMW cycle. Either the
OS or the
harddrive itself.
I am thinking about sth like
alignment = MAX(request_alignment, opt_transfer, min_sparse)
as a starting point?
I found that opt_transfer seems to be 0 for everything I found to test.
So maybe even reduce the alignment to MAX(request_alignment, min_sparse).
Peter