From: Vladimir Sementsov-Ogievskiy
Subject: Re: [Qemu-block] [PATCH 1/1] mirror: double performance of the bulk stage if the disc is full
Date: Wed, 13 Jul 2016 11:00:28 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.6.0

On 12.07.2016 16:51, Kevin Wolf wrote:
On 12.07.2016 at 11:36, Denis V. Lunev wrote:
From: Vladimir Sementsov-Ogievskiy <address@hidden>

Mirror can issue up to 16 in-flight requests, but on a full copy
(the whole source disk is non-zero) only 1 request is ever in flight.
This happens because the request is not limited in size: its data
occupies the entire available capacity of s->buf.

The patch limits the size of a request to an artificial constant
(1 MB here), which is neither too big nor too small. This effectively
restores the parallelism the mirror code was designed for.

The result is significant: the time to migrate a 10 GB disk is reduced
from ~350 sec to 170 sec.

Signed-off-by: Vladimir Sementsov-Ogievskiy <address@hidden>
Signed-off-by: Denis V. Lunev <address@hidden>
CC: Stefan Hajnoczi <address@hidden>
CC: Fam Zheng <address@hidden>
CC: Kevin Wolf <address@hidden>
CC: Max Reitz <address@hidden>
CC: Jeff Cody <address@hidden>
CC: Eric Blake <address@hidden>
  block/mirror.c | 8 ++++++--
  1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/block/mirror.c b/block/mirror.c
index 4fe127e..53d3bcd 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -23,7 +23,9 @@
 #define SLICE_TIME 100000000ULL /* ns */
 #define MAX_IN_FLIGHT 16
-#define DEFAULT_MIRROR_BUF_SIZE   (10 << 20)
+#define MAX_IO_SECTORS ((1 << 20) >> BDRV_SECTOR_BITS) /* 1 Mb */
 /* The mirroring buffer is a list of granularity-sized chunks.
  * Free chunks are organized in a list.
@@ -387,7 +389,9 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
                                            nb_chunks * sectors_per_chunk,
                                            &io_sectors, &file);
          if (ret < 0) {
-            io_sectors = nb_chunks * sectors_per_chunk;
+            io_sectors = MIN(nb_chunks * sectors_per_chunk, MAX_IO_SECTORS);
+        } else if (ret & BDRV_BLOCK_DATA) {
+            io_sectors = MIN(io_sectors, MAX_IO_SECTORS);

Would it make sense to consider the actual buffer size? If we have
s->buf_size / 16 > 1 MB, then this is wasting buffer space.

On the other hand, there is probably a minimum size below which a single
larger buffer performs better than two concurrent small ones. What that
size is, though, is hard to tell. If we assume that 1 MB is a good
default (should we do some more testing to find the sweet spot?), we
could write this as:

   io_sectors = MIN(io_sectors,
                    MAX((s->buf_size / BDRV_SECTOR_SIZE) / MAX_IN_FLIGHT,
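
For illustration only, the clamp suggested above can be sketched as a
standalone helper. The final argument of MAX() is cut off in the message
as archived; the sketch assumes it is MAX_IO_SECTORS. clamp_io_sectors()
is a hypothetical name, not a function in block/mirror.c, and the MIN/MAX
macros are defined locally. The constants follow the patch, with
BDRV_SECTOR_BITS taken to be 9 (512-byte sectors).

```c
#include <assert.h>
#include <stdint.h>

#define BDRV_SECTOR_BITS 9
#define BDRV_SECTOR_SIZE (1ULL << BDRV_SECTOR_BITS)
#define MAX_IN_FLIGHT    16
#define MAX_IO_SECTORS   ((1 << 20) >> BDRV_SECTOR_BITS) /* 1 MB */

/* Local stand-ins for the usual MIN/MAX helper macros. */
#define MIN(a, b) ((a) < (b) ? (a) : (b))
#define MAX(a, b) ((a) > (b) ? (a) : (b))

/*
 * Hypothetical helper: cap a request at the larger of
 *   - an equal share of the buffer per in-flight request, and
 *   - the 1 MB floor from the patch (assumed to be the elided argument).
 */
static int64_t clamp_io_sectors(int64_t io_sectors, uint64_t buf_size)
{
    assert(io_sectors >= 0);

    /* Sectors available to each request if the buffer is split 16 ways. */
    int64_t share = (int64_t)((buf_size / BDRV_SECTOR_SIZE) / MAX_IN_FLIGHT);

    return MIN(io_sectors, MAX(share, (int64_t)MAX_IO_SECTORS));
}
```

With a 10 MB buffer, the per-request share is 640 KB, so the 1 MB floor
applies and a full-buffer request of 20480 sectors is cut to 2048 sectors.
With a 64 MB buffer, the share is 4 MB (8192 sectors), so larger requests
are allowed and the buffer is not left partly unused.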


Ok, thanks, will resend.

Best regards,
