From: Kevin Wolf
Subject: Re: [Qemu-devel] [PATCH 1/1] mirror: double performance of the bulk stage if the disc is full
Date: Tue, 12 Jul 2016 15:51:08 +0200
User-agent: Mutt/1.5.21 (2010-09-15)

On 12.07.2016 at 11:36, Denis V. Lunev wrote:
> From: Vladimir Sementsov-Ogievskiy <address@hidden>
>
> Mirror can do up to 16 in-flight requests, but actually on full copy
> (the whole source disk is non-zero) in-flight is always 1. This happens
> as the request is not limited in size: the data occupies maximum available
> capacity of s->buf.
>
> The patch limits the size of the request to some artificial constant
> (1 Mb here), which is not that big or small. This effectively enables
> back parallelism in mirror code as it was designed.
>
> The result is important: the time to migrate 10 Gb disk is reduced from
> ~350 sec to 170 sec.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <address@hidden>
> Signed-off-by: Denis V. Lunev <address@hidden>
> CC: Stefan Hajnoczi <address@hidden>
> CC: Fam Zheng <address@hidden>
> CC: Kevin Wolf <address@hidden>
> CC: Max Reitz <address@hidden>
> CC: Jeff Cody <address@hidden>
> CC: Eric Blake <address@hidden>
> ---
> block/mirror.c | 8 ++++++--
> 1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/block/mirror.c b/block/mirror.c
> index 4fe127e..53d3bcd 100644
> --- a/block/mirror.c
> +++ b/block/mirror.c
> @@ -23,7 +23,9 @@
>
> #define SLICE_TIME 100000000ULL /* ns */
> #define MAX_IN_FLIGHT 16
> -#define DEFAULT_MIRROR_BUF_SIZE (10 << 20)
> +#define MAX_IO_SECTORS ((1 << 20) >> BDRV_SECTOR_BITS) /* 1 Mb */
> +#define DEFAULT_MIRROR_BUF_SIZE \
> + (MAX_IN_FLIGHT * MAX_IO_SECTORS * BDRV_SECTOR_SIZE)
>
> /* The mirroring buffer is a list of granularity-sized chunks.
> * Free chunks are organized in a list.
> @@ -387,7 +389,9 @@ static uint64_t coroutine_fn
> mirror_iteration(MirrorBlockJob *s)
> nb_chunks * sectors_per_chunk,
> &io_sectors, &file);
> if (ret < 0) {
> - io_sectors = nb_chunks * sectors_per_chunk;
> + io_sectors = MIN(nb_chunks * sectors_per_chunk, MAX_IO_SECTORS);
> + } else if (ret & BDRV_BLOCK_DATA) {
> + io_sectors = MIN(io_sectors, MAX_IO_SECTORS);
> }

Would it make sense to consider the actual buffer size? If we have
s->buf_size / 16 > 1 MB, then capping requests at 1 MB wastes buffer
space.

On the other hand, there is probably a minimum size below which a single
larger buffer performs better than two concurrent small ones. Where that
threshold lies is hard to tell, though. If we assume that 1 MB is a good
default (should we do some more testing to find the sweet spot?), we
could write this as:

    io_sectors = MIN(io_sectors,
                     MAX((s->buf_size / BDRV_SECTOR_SIZE) / MAX_IN_FLIGHT,
                         MAX_IO_SECTORS));

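To make the effect of this clamp concrete, here is a minimal standalone sketch, not code from block/mirror.c. The helper names clamp_io_sectors, min_u64 and max_u64 are hypothetical, the constants are copied from the quoted patch, and BDRV_SECTOR_BITS is assumed to be 9 (512-byte sectors). The buffer size is passed in as a plain argument instead of being read from s->buf_size.

```c
#include <assert.h>
#include <stdint.h>

/* Constants as in the quoted patch; the sector size is an assumption. */
#define BDRV_SECTOR_BITS 9
#define BDRV_SECTOR_SIZE (1ULL << BDRV_SECTOR_BITS)
#define MAX_IN_FLIGHT    16
#define MAX_IO_SECTORS   ((1 << 20) >> BDRV_SECTOR_BITS)   /* 1 MB */

static uint64_t min_u64(uint64_t a, uint64_t b) { return a < b ? a : b; }
static uint64_t max_u64(uint64_t a, uint64_t b) { return a > b ? a : b; }

/*
 * Hypothetical helper: limit one request to an equal share of the
 * buffer per in-flight slot, but never let the cap drop below
 * MAX_IO_SECTORS (1 MB).
 */
static uint64_t clamp_io_sectors(uint64_t io_sectors, uint64_t buf_size)
{
    uint64_t per_slot;

    assert(MAX_IN_FLIGHT > 0);
    per_slot = (buf_size / BDRV_SECTOR_SIZE) / MAX_IN_FLIGHT;
    return min_u64(io_sectors, max_u64(per_slot, MAX_IO_SECTORS));
}
```

With a 16 MB buffer the per-slot share equals the 1 MB floor, so a 16 MB request is cut to 2048 sectors. With a 64 MB buffer the share grows to 4 MB (8192 sectors), so the buffer is fully used by 16 requests. With a buffer smaller than 16 MB the 1 MB floor wins, which means fewer than 16 requests fit in flight at once.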
Kevin