Re: [Qemu-devel] [PATCH 16/19] block/parallels: no need to flush on each

qemu-devel

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH 16/19] block/parallels: no need to flush on each

From:	Denis V. Lunev
Subject:	Re: [Qemu-devel] [PATCH 16/19] block/parallels: no need to flush on each block allocation table update
Date:	Tue, 13 Jan 2015 18:17:37 +0300
User-agent:	Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:31.0) Gecko/20100101 Thunderbird/31.3.0

On 13/01/15 17:50, Roman Kagan wrote:

On Tue, Dec 30, 2014 at 01:07:09PM +0300, Denis V. Lunev wrote:

 From the point of guest each write to real disk prior to disk barrier
operation could be lost. Therefore there is no problem that "not synced"
new block is lost due to not updated allocation table if QEMU is crashed.

This patch improves writing performance of
   qemu-img create -f parallels -o cluster_size=64k ./1.hds 64G
   qemu-io -f parallels -c "write -P 0x11 0 1024k" 1.hds
from 45 Mb/sec to 160 Mb/sec on my SSD disk. The gain on rotational media
is much more sufficient, from 800 Kb/sec to 45 Mb/sec.

Signed-off-by: Denis V. Lunev <address@hidden>
CC: Kevin Wolf <address@hidden>
CC: Stefan Hajnoczi <address@hidden>
---
  block/parallels.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/parallels.c b/block/parallels.c
index ddc3aee..46cf031 100644
--- a/block/parallels.c
+++ b/block/parallels.c
@@ -194,7 +194,7 @@ static int64_t allocate_sector(BlockDriverState *bs, 
int64_t sector_num)

tmp = cpu_to_le32(s->bat[idx]);- ret = bdrv_pwrite_sync(bs->file, bat_offset(idx), &tmp, sizeof(tmp));

+    ret = bdrv_pwrite(bs->file, bat_offset(idx), &tmp, sizeof(tmp));
      if (ret < 0) {
          return ret;
      }

AFAICT this worsens the problem that existed before this patch:

if the preceding bdrv_truncate(+cluster_size) gets reordered with this
bat entry update, there's a risk that on poweroff (snapshot, etc.) the
bat entry gets written to the storage while the file size is not
updated.  As a result, the bat entry in the file would point at a
cluster which hadn't been allocated yet.  When a new block is eventually
added to the file, you'd have two bat entries pointing at the same
cluster.

The _sync version used to leave this window quite narrow due to the
flush following the write.  The patch makes the reordering more likely.

I'm afraid the only reliable way to handle it is to put a barrier
between truncate and bat update, and mitigate the costs by batching the
file expansions, as you seem to do in the followup patches.


Roman.

no, I think that this will not happen.

Here we are under the lock which was was taken in
parallels_co_writev and thus 2 contexts are impossible.
On the other hand bdrv_truncate does not delegate
the job to the other context and is performed
immediately. This from my point of view means
that truncate is performed always first and from
the safe context.

Regards,
    Den

[Prev in Thread]

Current Thread

[Next in Thread]

Re: [Qemu-devel] [PATCH 16/19] block/parallels: no need to flush on each block allocation table update, Roman Kagan, 2015/01/13
- Re: [Qemu-devel] [PATCH 16/19] block/parallels: no need to flush on each block allocation table update, Denis V. Lunev <=
  - Re: [Qemu-devel] [PATCH 16/19] block/parallels: no need to flush on each block allocation table update, Denis V. Lunev, 2015/01/13
    - Re: [Qemu-devel] [PATCH 16/19] block/parallels: no need to flush on each block allocation table update, Roman Kagan, 2015/01/14
    - Re: [Qemu-devel] [PATCH 16/19] block/parallels: no need to flush on each block allocation table update, Denis V. Lunev, 2015/01/14

Prev by Date: Re: [Qemu-devel] [RFC PATCH v7 15/21] replay: checkpoints
Next by Date: Re: [Qemu-devel] [PATCH v5 2/5] raw-posix: Refactor logical block size detection.
Previous by thread: Re: [Qemu-devel] [PATCH 16/19] block/parallels: no need to flush on each block allocation table update
Next by thread: Re: [Qemu-devel] [PATCH 16/19] block/parallels: no need to flush on each block allocation table update
Index(es):
- Date
- Thread