[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-block] [Qemu-devel] [PATCH v5 4/4] block: ignore flush request

From: Evgeny Yakovlev
Subject: Re: [Qemu-block] [Qemu-devel] [PATCH v5 4/4] block: ignore flush requests when storage is clean
Date: Fri, 8 Jul 2016 18:19:11 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.8.0

On 08.07.2016 02:04, Eric Blake wrote:
On 07/04/2016 08:38 AM, Denis V. Lunev wrote:
From: Evgeny Yakovlev <address@hidden>

Some guests (win2008 server for example) do a lot of unnecessary
flushing when underlying media has not changed. This adds additional
overhead on host when calling fsync/fdatasync.

This change introduces a write generation scheme in BlockDriverState.
Current write generation is checked against last flushed generation to
avoid unnessesary flushes.

The problem with excessive flushing was found by a performance test
which does parallel directory tree creation (from 2 processes).
Results improved from 0.424 loops/sec to 0.432 loops/sec.
Each loop creates 10^3 directories with 10 files in each.

+++ b/block/io.c
@@ -1294,6 +1294,7 @@ static int coroutine_fn 
bdrv_aligned_pwritev(BlockDriverState *bs,
      bdrv_debug_event(bs, BLKDBG_PWRITEV_DONE);
+ ++bs->write_gen;
Why pre-increment?  Most code uses post-increment when done as a
statement in isolation.

Just a habit of mine, from C++ days, where you can never be sure if someone overloaded post-increment operator and it will then generate a temporary object because post-increment is defined to return previous value. Now it's just a muscle memory :)

      bdrv_set_dirty(bs, start_sector, end_sector - start_sector);
if (bs->wr_highest_offset < offset + bytes) {
@@ -2211,6 +2212,7 @@ int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
      int ret;
      BdrvTrackedRequest req;
+    int current_gen = bs->write_gen;
if (!bs || !bdrv_is_inserted(bs) || bdrv_is_read_only(bs) ||
          bdrv_is_sg(bs)) {
@@ -2219,6 +2221,12 @@ int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
tracked_request_begin(&req, bs, 0, 0, BDRV_TRACKED_FLUSH); + /* Wait until any previous flushes are completed */
+    while (bs->flush_started_gen != bs->flushed_gen) {
Should this be an inequality, as in s/!=/</, in case several flushes can
be started in parallel and where the later flush ends up finishing
before the earlier flush?

flush_started_gen is always ahead of flushed_gen or is equal to it no matter how many requests we have in flight. using != allows us to ignore checking for unsigned overflow (which you also mention in this email).

+        qemu_co_queue_wait(&bs->flush_queue);
+    }
+    bs->flush_started_gen = current_gen;
      /* Write back all layers by calling one driver function */
      if (bs->drv->bdrv_co_flush) {
          ret = bs->drv->bdrv_co_flush(bs);
@@ -2239,6 +2247,11 @@ int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
          goto flush_parent;
+ /* Check if we really need to flush anything */
+    if (bs->flushed_gen == current_gen) {
Likewise, if you are tracking generations, should this be s/==/<=/ (am I
getting the direction correct)?

Same here - '==' is so that we don't have to worry about unsigned overflow.

+++ b/include/block/block_int.h
@@ -420,6 +420,11 @@ struct BlockDriverState {
                           note this is a reference count */
      bool probed;
+ CoQueue flush_queue; /* Serializing flush queue */
+    unsigned int write_gen;         /* Current data generation */
+    unsigned int flush_started_gen; /* Generation for which flush has started 
+    unsigned int flushed_gen;       /* Flushed write generation */
Should these be 64-bit integers to avoid risk of overflow after just
2^32 flush attempts?

We don't have to care about unsigned overflow. It has a well defined behavior and we only check if generations are equal or not.

reply via email to

[Prev in Thread] Current Thread [Next in Thread]