
Re: [Qemu-devel] [PATCH] block: don't probe zeroes in bs->file by default on block_status


From: Vladimir Sementsov-Ogievskiy
Subject: Re: [Qemu-devel] [PATCH] block: don't probe zeroes in bs->file by default on block_status
Date: Fri, 11 Jan 2019 17:27:46 +0000


On 11.01.2019 17:04, Eric Blake wrote:
> On 1/11/19 10:09 AM, Vladimir Sementsov-Ogievskiy wrote:
> 
>>>>> I suggested one: Pass large contiguous allocated ranges to the protocol
>>>>> driver, but just assume that the allocation status is correct in the
>>>>> format layer if they are small.
>>>>
>>>> So, for a fully allocated image, we will always call lseek?
>>>
>>> If they are fully contiguous, yes. But that's a single lseek() call per
>>> image then instead of an lseek() for every 64k, so not a big problem.
>>
>> lseek is called on each mirror iteration, why one per image?
> 
> If the image has no holes, then lseek(0, SEEK_HOLE) will return EOF, and
> then you know that there are no holes, and you don't need to make any
> further lseek() calls.  Hence, once per image.  A fully-allocated file
> that has areas that read as known zeroes can be determined by fiemap
> (but not by lseek, which can only detect unallocated holes) - but we
> already know that while fiemap has more information, it also has more
> problems (you cannot use it safely without sync, but sync makes it too
> slow to use), so that is a non-starter.
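
For reference, the single probe described above could be sketched roughly as follows. This is only an illustration in Python, not QEMU code: `has_no_holes` is a hypothetical helper name, and it assumes a platform where `os.SEEK_HOLE` is available (e.g. Linux).

```python
import os

def has_no_holes(path):
    """Return True if a single lseek(SEEK_HOLE) from offset 0 lands on EOF."""
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        if size == 0:
            # Nothing to probe; lseek(SEEK_HOLE) at or past EOF fails with ENXIO.
            return True
        # Every file has an implicit hole at EOF, so a result equal to the
        # file size means no hole was found before the end of the file.
        return os.lseek(fd, 0, os.SEEK_HOLE) == size
    finally:
        os.close(fd)
```

If this returns True, no further lseek() calls are needed for that file. Note that it says nothing about allocated ranges that happen to read as zeroes.
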
> 
>>>
>>> In the more realistic case, you will still call lseek() occasionally
>>> because you will have some fragmentation, but the fragments can be
>>> rather large. But it should still significantly reduce them compared to
>>> now because you skip it for those parts with small contiguous
>>> allocations where lseek() would be called a lot today.
>>>
>>> Kevin
>>>
>>
>> OK, so you propose not calling lseek for data regions reported by the
>> format layer that are small enough. For images that are less fragmented,
>> this helps less, or doesn't help at all.
> 
> Indeed - pick some threshold (maybe 16 clusters); if block status of the
> format layer returns something smaller than the threshold, don't bother
> refining the answer further by doing block status of the protocol layer
> (if the caller is iterating over an image 1 cluster at a time, then the
> threshold will never be tripped and thus you'll never do an lseek); but
> where the block status of the format layer is large, we are reading the
> file in larger chunks so we end up with fewer lseeks in the long run
> anyways.
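
The threshold idea above could be sketched like this. It is a rough Python illustration, not the QEMU implementation: the function names are hypothetical, the 64k cluster size is taken from the figure mentioned earlier in the thread, and 16 clusters is just the "maybe" value suggested above.

```python
CLUSTER_SIZE = 64 * 1024   # assumed cluster size (the 64k mentioned in the thread)
THRESHOLD_CLUSTERS = 16    # tentative threshold from the discussion, not a tuned value

def should_probe_protocol(extent_bytes,
                          cluster_size=CLUSTER_SIZE,
                          threshold_clusters=THRESHOLD_CLUSTERS):
    """Decide whether to refine a format-layer extent via the protocol layer.

    Extents smaller than the threshold are trusted as reported by the
    format layer; only larger ones are worth an lseek().
    """
    return extent_bytes >= threshold_clusters * cluster_size

def count_probes(extents):
    """Count how many format-layer extents would trigger a protocol-layer probe."""
    return sum(1 for n in extents if should_probe_protocol(n))
```

With these assumptions, a caller iterating one cluster at a time never triggers a probe, while a single large contiguous extent triggers exactly one.
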
> 
>>
>> Why do you think it is better?
>>
>> For images that are not preallocated it is worse, as it covers fewer cases.
>> So, for our scenarios it is worse.
> 
> Anywhere that you skip calling lseek(), you end up missing out on
> opportunities to optimize had you instead been able to learn from lseek
> that you were on a hole after all.

In which cases, other than preallocated images, would we benefit from lseek?

> So it becomes a balancing question:
> how much time is spent on probing for whether an optimization is even
> possible, vs. how much time is spent if the probe succeeded and you can
> then optimize.  For a fully-allocated image, all of the time spent
> probing is wasted (you never find a hole, so every probe was wasted).
> So it is indeed a tradeoff when picking your heuristics, of trying to
> balance how likely the probe will justify the time spent on the probe.
> 
>> The only case where the heuristic works better is when the user has a
>> preallocated image but doesn't know about the new option that restores the
>> old behavior. We are not interested in this case and can't go this way, as
>> it doesn't guarantee that some customer will not come back with
>> lseek-related problems.
>>
>> Don't you like what Eric proposes, binding the behavior switch to the
>> existing detect-zeroes option?
>>
>> Or we can add an opposite option that enables the new behavior, keeping the
>> old one by default. Then everything stays as is, and whoever needs the new
>> behavior uses the new option. The heuristic may be implemented then too.
>>
> 
