Re: [PATCH] qcow2: Reduce write_zeroes size in handle_alloc

qemu-devel

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] qcow2: Reduce write_zeroes size in handle_alloc_space()

From:	Vladimir Sementsov-Ogievskiy
Subject:	Re: [PATCH] qcow2: Reduce write_zeroes size in handle_alloc_space()
Date:	Wed, 10 Jun 2020 11:38:39 +0300
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.8.1

09.06.2020 18:29, Vladimir Sementsov-Ogievskiy wrote:

09.06.2020 18:18, Kevin Wolf wrote:

Am 09.06.2020 um 16:46 hat Eric Blake geschrieben:

On 6/9/20 9:28 AM, Vladimir Sementsov-Ogievskiy wrote:

09.06.2020 17:08, Kevin Wolf wrote:

Since commit c8bb23cbdbe, handle_alloc_space() is called for newly
allocated clusters to efficiently initialise the COW areas with zeros if
necessary. It skips the whole operation if both start_cow nor end_cow
are empty. However, it requests zeroing the whole request size (possibly
multiple megabytes) even if only one end of the request actually needs
this.

This patch reduces the write_zeroes request size in this case so that we
don't unnecessarily zero-initialise a region that we're going to
overwrite immediately.


Hmm, I'm afraid, that this may make things worse in some cases, as with
one big write-zero request
we preallocate data-region in the protocol file, so we have better
locality for the clusters we
are going to write. And, in the same time, with BDRV_REQ_NO_FALLBACK
flag write-zero must be
fast anyway (especially in comparison with the following write request).

Â Â Â Â Â Â Â Â Â  /*
Â Â Â Â Â Â Â Â Â Â  * instead of writing zero COW buffers,
Â Â Â Â Â Â Â Â Â Â  * efficiently zero out the whole clusters
Â Â Â Â Â Â Â Â Â Â  */
-Â Â Â Â Â Â Â  ret = qcow2_pre_write_overlap_check(bs, 0, m->alloc_offset,
-Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â 
Â Â  m->nb_clusters *
s->cluster_size,
-Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â 
Â Â Â Â  true);
+Â Â Â Â Â Â Â  ret = qcow2_pre_write_overlap_check(bs, 0, start, len, true);
Â Â Â Â Â Â Â Â Â  if (ret < 0) {
Â Â Â Â Â Â Â Â Â Â Â Â Â  return ret;
Â Â Â Â Â Â Â Â Â  }
Â Â Â Â Â Â Â Â Â  BLKDBG_EVENT(bs->file, BLKDBG_CLUSTER_ALLOC_SPACE);
-Â Â Â Â Â Â Â  ret = bdrv_co_pwrite_zeroes(s->data_file, m->alloc_offset,
-Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â  
m->nb_clusters * s->cluster_size,
+Â Â Â Â Â Â Â  ret = bdrv_co_pwrite_zeroes(s->data_file, start, len,
Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â  
BDRV_REQ_NO_FALLBACK);


Good point.  If we weren't using BDRV_REQ_NO_FALLBACK, then avoiding a
pre-zero pass over the middle is essential.  But since we are insisting that
the pre-zero pass be fast or else immediately fail, the time spent in
pre-zeroing should not be a concern.  Do you have benchmark numbers stating
otherwise?


I stumbled across this behaviour (write_zeros for 2 MB, then overwrite
almost everything) in the context of a different bug, and it just didn't
make much sense to me. Is there really a file system where fragmentation
is introduced by not zeroing the area first and then overwriting it?

I'm not insisting on making this change because the behaviour is
harmless if odd, but if we think that writing twice to some blocks is an
optimisation, maybe we should actually measure and document this.


Not to same blocks: first we do write-zeroes to the area aligned-up to cluster 
bound. So it's more probable that the resulting clusters would be contigous on 
file-system.. With your patch it may be split into two parts. (a bit too 
theoretical, I'd better prove it by example)

Also, we (Virtuozzo) have to support some custom distributed fs, where 
allocation itself is expensive, so the additional benefit of first (larger) 
write-zero request is that we have one allocation request instead of two (with 
your patch) or three (if we decide to make two write-zero opersions).


Hmm, Denis Lunev said me that double allocation should not be a problem, as it 
is happening almost in the same time, fs should handle this. So probably my 
counter-arguments are wrong. Still, handle_alloc_space() attracts attention 
often, and some benchmark-tests around it will not hurt.


--
Best regards,
Vladimir

[Prev in Thread]

Current Thread

[Next in Thread]

[PATCH] qcow2: Reduce write_zeroes size in handle_alloc_space(), Kevin Wolf, 2020/06/09
- Re: [PATCH] qcow2: Reduce write_zeroes size in handle_alloc_space(), Vladimir Sementsov-Ogievskiy, 2020/06/09
  - Re: [PATCH] qcow2: Reduce write_zeroes size in handle_alloc_space(), Eric Blake, 2020/06/09
    - Re: [PATCH] qcow2: Reduce write_zeroes size in handle_alloc_space(), Kevin Wolf, 2020/06/09
    - Re: [PATCH] qcow2: Reduce write_zeroes size in handle_alloc_space(), Vladimir Sementsov-Ogievskiy, 2020/06/09
    - Re: [PATCH] qcow2: Reduce write_zeroes size in handle_alloc_space(), Vladimir Sementsov-Ogievskiy <=
    - Re: [PATCH] qcow2: Reduce write_zeroes size in handle_alloc_space(), Eric Blake, 2020/06/09
    - Re: [PATCH] qcow2: Reduce write_zeroes size in handle_alloc_space(), Vladimir Sementsov-Ogievskiy, 2020/06/10
    - Re: [PATCH] qcow2: Reduce write_zeroes size in handle_alloc_space(), Kevin Wolf, 2020/06/10
- Re: [PATCH] qcow2: Reduce write_zeroes size in handle_alloc_space(), Eric Blake, 2020/06/09

Prev by Date: Re: [PATCH] hw/vfio/pci-quirks: Fix broken legacy IGD passthrough
Next by Date: Re: [PATCH v2 1/1] tricore: added triboard with tc27x_soc
Previous by thread: Re: [PATCH] qcow2: Reduce write_zeroes size in handle_alloc_space()
Next by thread: Re: [PATCH] qcow2: Reduce write_zeroes size in handle_alloc_space()
Index(es):
- Date
- Thread