Re: [Qemu-devel] [RFC] qcow2 journalling draft
From: Stefan Hajnoczi
Subject: Re: [Qemu-devel] [RFC] qcow2 journalling draft
Date: Thu, 5 Sep 2013 11:35:43 +0200
User-agent: Mutt/1.5.21 (2010-09-15)
On Tue, Sep 03, 2013 at 03:45:52PM +0200, Kevin Wolf wrote:
> This contains an extension of the qcow2 spec that introduces journalling
> to the image format, plus some preliminary type definitions and
> function prototypes in the qcow2 code.
>
> Journalling functionality is a crucial feature for the design of data
> deduplication, and it will improve the core part of qcow2 by avoiding
> cluster leaks on crashes as well as provide an easier way to get a
> reliable implementation of performance features like Delayed COW.
>
> At this point of the RFC, it would be most important to review the
> on-disk structure. Once we're confident that it can do everything we
> want, we can start going into more detail on the qemu side of things.
>
> Signed-off-by: Kevin Wolf <address@hidden>
> ---
> block/Makefile.objs | 2 +-
> block/qcow2-journal.c | 55 ++++++++++++++
> block/qcow2.h | 78 +++++++++++++++++++
> docs/specs/qcow2.txt | 204 +++++++++++++++++++++++++++++++++++++++++++++++++-
> 4 files changed, 337 insertions(+), 2 deletions(-)
> create mode 100644 block/qcow2-journal.c

Although we are still discussing details of the on-disk layout, the
general design is clear enough to discuss how the journal will be used.

Today qcow2 uses Qcow2Cache to do lazy, ordered metadata updates. The
performance is pretty good with two exceptions that I can think of:

1. The delayed CoW problem that Kevin has been working on. Guests
   perform sequential writes that are smaller than a qcow2 cluster. The
   first write triggers a copy-on-write of the full cluster. Later
   writes then overwrite the copied data. It would be more efficient to
   anticipate sequential writes and hold off on CoW where possible.

2. Lazy metadata updates lead to bursty behavior and expensive flushes.
   We do not take advantage of disk bandwidth since metadata updates
   stay in the Qcow2Cache until the last possible second. When the
   guest issues a flush we must write out dirty Qcow2Cache entries and
   possibly fsync between them if dependencies have been set (e.g.
   refcount before L2).
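
To make point 2 concrete, here is a toy model of a flush with one
dependency set. It is only a sketch: ToyMetadataCache, depends_on and the
log strings are made up for illustration and are not the Qcow2Cache code.
It just shows why a guest flush can turn into write, fsync, write.

```python
class ToyMetadataCache:
    """Toy stand-in for a lazy metadata cache (hypothetical, not Qcow2Cache)."""

    def __init__(self, name, log):
        self.name = name
        self.log = log          # shared list recording the I/O that would be issued
        self.dirty = {}         # offset -> data, held back until flush
        self.depends_on = None  # cache that must reach the disk before this one

    def put(self, offset, data):
        # Lazy update: nothing is written until flush() is called.
        self.dirty[offset] = data

    def flush(self):
        # A dependency forces the other cache out first, followed by an
        # fsync, so that the two sets of writes cannot be reordered.
        if self.depends_on is not None:
            dependency, self.depends_on = self.depends_on, None
            dependency.flush()
            self.log.append("fsync")
        for offset in sorted(self.dirty):
            self.log.append(f"write {self.name}@{offset}")
        self.dirty.clear()


log = []
refcount = ToyMetadataCache("refcount", log)
l2 = ToyMetadataCache("l2", log)

refcount.put(0, b"refcount block")
l2.put(8, b"l2 entry")
l2.depends_on = refcount    # refcount before L2

l2.flush()                  # the guest flush pays for everything at once
print(log)
```

All of the I/O, including the extra fsync, happens inside the flush call.
Nothing was written while the disk was otherwise idle.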
How will the journal change this situation? Writes that go through the
journal are doubled: they must first be written to the journal and
fsynced, and only then can they be applied to the actual image.
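
The doubling can be sketched with a generic write-ahead log. This is an
assumption-laden illustration, not the on-disk format from the RFC: the
record layout (8-byte offset, 4-byte length, payload) and the names
journalled_write, replay and demo are invented here.

```python
import os
import struct
import tempfile

# Hypothetical record header: little-endian 64-bit offset, 32-bit length.
RECORD_HEADER = struct.Struct("<QI")


def journalled_write(journal_fd, image_fd, offset, data):
    """Journal the update, fsync, then apply it. Returns total bytes written."""
    record = RECORD_HEADER.pack(offset, len(data)) + data
    os.lseek(journal_fd, 0, os.SEEK_END)
    written = os.write(journal_fd, record)        # first copy: the journal
    os.fsync(journal_fd)                          # must be stable before applying
    written += os.pwrite(image_fd, data, offset)  # second copy: the image
    return written


def replay(journal_path, image_fd):
    """Re-apply every complete journal record. Returns the number applied."""
    applied = 0
    with open(journal_path, "rb") as journal:
        while True:
            header = journal.read(RECORD_HEADER.size)
            if len(header) < RECORD_HEADER.size:
                break
            offset, length = RECORD_HEADER.unpack(header)
            payload = journal.read(length)
            if len(payload) < length:
                break                             # torn record, stop here
            os.pwrite(image_fd, payload, offset)
            applied += 1
    return applied


def demo():
    data = b"L2 entry"
    flags = os.O_RDWR | os.O_CREAT
    with tempfile.TemporaryDirectory() as scratch:
        journal_path = os.path.join(scratch, "journal")
        fds = [os.open(os.path.join(scratch, name), flags, 0o600)
               for name in ("journal", "image", "recovered")]
        journal_fd, image_fd, recovered_fd = fds
        try:
            total = journalled_write(journal_fd, image_fd, 4096, data)
            # Pretend the image was lost and rebuild it from the journal alone.
            replayed = replay(journal_path, recovered_fd)
            same = (os.pread(recovered_fd, len(data), 4096)
                    == os.pread(image_fd, len(data), 4096) == data)
        finally:
            for fd in fds:
                os.close(fd)
    return total, replayed, same


print(demo())
```

For an 8-byte update the sketch writes the payload twice plus a 12-byte
header, and it needs an fsync before the in-place write can start. What
it buys is that the journal alone is enough to redo the update.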
How do we benefit by using the journal?
Stefan