From: Eric Blake
Subject: Re: [Qemu-devel] [PATCH] block: don't probe zeroes in bs->file by default on block_status
Date: Fri, 11 Jan 2019 11:04:32 -0600
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.3.1

On 1/11/19 10:09 AM, Vladimir Sementsov-Ogievskiy wrote:

>>>> I suggested one: Pass large contiguous allocated ranges to the protocol
>>>> driver, but just assume that the allocation status is correct in the
>>>> format layer if they are small.
>>>
>>> So, for fully allocated image, we will call lseek always?
>>
>> If they are fully contiguous, yes. But that's a single lseek() call per
>> image then instead of an lseek() for every 64k, so not a big problem.
> 
> lseek is called on each mirror iteration, why one per image?

If the image has no holes, then lseek(fd, 0, SEEK_HOLE) returns the
offset of EOF (the implicit hole at the end of the file), which tells
you that there are no holes and that no further lseek() calls are
needed.  Hence, once per image.  A fully-allocated file can still
contain areas that read as known zeroes; fiemap can detect those
(lseek cannot, since it only detects unallocated holes) - but we
already know that while fiemap has more information, it also has more
problems (you cannot use it safely without sync, but sync makes it too
slow to use), so that is a non-starter.
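
To make the "once per image" point concrete, here is a minimal sketch in
Python (not QEMU code; the helper name has_no_holes() is made up for
illustration).  It assumes a platform where os.SEEK_HOLE is available,
such as Linux:

```python
import os


def has_no_holes(path):
    """Return True if a single SEEK_HOLE probe finds no hole before EOF."""
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        if size == 0:
            # Seeking at or past EOF fails with ENXIO; an empty file
            # trivially has no holes.
            return True
        # Ask for the first hole at or after offset 0.  Every file has an
        # implicit hole at EOF, so a result equal to the file size means
        # there is no real hole anywhere in the file.
        return os.lseek(fd, 0, os.SEEK_HOLE) >= size
    finally:
        os.close(fd)
```

If this returns True, the caller can remember the answer and skip any
further lseek() probing for that image.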

>>
>> In the more realistic case, you will still call lseek() occasionally
>> because you will have some fragmentation, but the fragments can be
>> rather large. But it should still significantly reduce them compared to
>> now because you skip it for those parts with small contiguous
>> allocations where lseek() would be called a lot today.
>>
>> Kevin
>>
> 
> Ok, you propose not to call lseek for small enough data regions reported by
> format layer. And for images which are less fragmented, this helps less or
> doesn't help.

Indeed - pick some threshold (maybe 16 clusters).  If block status of the
format layer returns something smaller than the threshold, don't bother
refining the answer further by doing block status of the protocol layer.
If the caller is iterating over an image 1 cluster at a time, then the
threshold will never be tripped and thus you'll never do an lseek.  But
where the block status of the format layer is large, we are reading the
file in larger chunks, so we end up with fewer lseeks in the long run
anyway.
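
A rough sketch of that threshold check, in Python rather than QEMU's C.
The function names and the 64k cluster size are assumptions for
illustration only (64k is the granularity mentioned earlier in this
thread); 16 clusters is just the "maybe" value from above:

```python
CLUSTER_SIZE = 64 * 1024   # assumed cluster size, for illustration
THRESHOLD_CLUSTERS = 16    # the "maybe 16 clusters" guess, not a tuned value


def should_probe_protocol(extent_bytes,
                          cluster_size=CLUSTER_SIZE,
                          threshold_clusters=THRESHOLD_CLUSTERS):
    """Refine with the protocol layer only for large allocated extents."""
    return extent_bytes >= threshold_clusters * cluster_size


def count_probes(extents):
    """Count how many format-layer extents would trigger a protocol probe."""
    return sum(1 for length in extents if should_probe_protocol(length))
```

With these assumptions, a caller walking the image one cluster at a time
never reaches the threshold, so count_probes() stays at zero for it.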

> 
> Why do you think it is better?
> 
> For non-preallocated images it is worse, as it covers fewer cases. So, for our
> scenarios it is worse.

Anywhere that you skip calling lseek(), you miss out on optimizations
that would have been possible had lseek told you that you were on a
hole after all.  So it becomes a balancing question: how much time is
spent on probing for whether an optimization is even possible, vs. how
much time is saved when the probe succeeds and you can then optimize.
For a fully-allocated image, all of the time spent probing is wasted,
because you never find a hole.  So picking the heuristic is indeed a
tradeoff: you are trying to balance how likely the probe is to justify
the time spent on it.
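
One way to write that balance down, as a sketch only: the probe pays off
on average when the chance of finding a hole times the time saved exceeds
the cost of the probe.  The function and parameter names here are
hypothetical, and the numbers in the usage note are made up, not
measurements:

```python
def probe_worthwhile(p_hole, t_probe, t_saved):
    """Return True if the expected saving from a probe exceeds its cost.

    p_hole:  estimated probability that the probe finds a hole (0.0 to 1.0)
    t_probe: time spent on one probe (any unit)
    t_saved: time saved when a hole is found (same unit)
    """
    return p_hole * t_saved > t_probe
```

For a fully-allocated image p_hole is 0, so probe_worthwhile(0.0, 1.0,
100.0) is False no matter how large the potential saving is, while
probe_worthwhile(0.5, 1.0, 100.0) is True.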

> The only case when the heuristic works better is when the user has a
> preallocated image but doesn't know about the new option, which returns the
> old behavior. We are not interested in this case and can't go this way, as
> it doesn't guarantee that some customer will not come again with
> lseek-related problems.
> 
> Don't you like what Eric proposes, binding the behavior switch to the
> existing detect-zeroes option?
> 
> Or, we can add an opposite option to enable the new behavior, keeping the
> old one by default. So, all stays as is, and whoever needs it uses the new
> option. The heuristic may be implemented then too.
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org
