Re: [Qemu-devel] [PATCH block-next 0/3] qemu-img check/qcow2: Allow fixing refcounts
Fri, 1 Jun 2012 09:06:32 +0100
On Fri, Jun 1, 2012 at 6:22 AM, Zhi Yong Wu <address@hidden> wrote:
> On Thu, May 31, 2012 at 5:26 PM, Stefan Hajnoczi <address@hidden> wrote:
>> On Wed, May 30, 2012 at 9:31 AM, Zhi Yong Wu <address@hidden> wrote:
>>> On Sat, May 12, 2012 at 12:48 AM, Kevin Wolf <address@hidden> wrote:
>>>> A prerequisite for a "QED mode" in qcow2, which doesn't update the refcount
>>> Recently, new concepts such as "QED mode" in qcow2 have come up
>>> frequently. Can anyone explain what it means? Thanks.
>> qcow2 has more metadata than qed. More metadata means more write
>> operations when allocating new clusters.
>> In order to overcome this performance issue qcow2 has a metadata
>> cache. But when QEMU is launched with -drive ...,cache=writethrough
>> (the default) the metadata cache *must* be in writethrough mode
> Why must it be? If -drive ...,cache=writethrough is specified, it
> means that the host page cache is on while the guest disk cache is
> off. Since the metadata cache lives in the host page cache, not in the
> guest, I think that it is in writeback mode.
Since the emulated disk write cache is off, we must ensure that guest
writes are on disk before completing them. Therefore we cannot cache
metadata updates in host RAM - they would be lost on power failure, but
we promised the guest its writes reached the disk!
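
To make the difference concrete, here is a toy sketch in Python. It is not
QEMU code: the class, its methods and the "l2[5]" key are hypothetical names
made up for illustration. It only models the rule above, that in writethrough
mode a metadata update must reach the image file before the guest write is
completed.

```python
class MetadataCache:
    """Toy model of a metadata cache in front of an image file.

    All names here are hypothetical; this is not the qcow2 implementation.
    """

    def __init__(self, writethrough):
        self.writethrough = writethrough
        self.in_ram = {}    # updates held only in host RAM
        self.on_disk = {}   # updates that have reached the image file

    def update(self, key, value):
        self.in_ram[key] = value
        if self.writethrough:
            # The guest has no write cache to flush, so the update must be
            # on disk before the guest's write request is completed.
            self.flush()

    def flush(self):
        self.on_disk.update(self.in_ram)
        self.in_ram.clear()

    def power_failure(self):
        # Anything still held only in host RAM is lost.
        self.in_ram.clear()
        return dict(self.on_disk)


wt = MetadataCache(writethrough=True)
wt.update("l2[5]", 0x50000)
wb = MetadataCache(writethrough=False)
wb.update("l2[5]", 0x50000)

survived_wt = wt.power_failure()   # update survives
survived_wb = wb.power_failure()   # update was never flushed, so it is gone
```

In the writeback case the loss is acceptable only because the guest knows it
has a write cache and is expected to flush it.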
>> instead of writeback mode. In other words, every metadata update
>> needs to be written to the image file before we complete the guest's
> What does it mean for a guest's write request to be completed?
For example, virtio-blk fills in the success status code and raises an
interrupt. This notifies the guest that the write is done.
>> write request. This means the metadata cache only hides the metadata
>> performance issue when -drive ...,cache=none|writeback are used
>> because there we can keep metadata changes buffered in memory until
>> the guest flushes the emulated disk write cache.
>> "QED mode" is a solution for -drive ...,cache=writethrough|directsync.
>> It simply doesn't update refcount metadata in the qcow2 image file
>> immediately in exchange for a refcount fixup step that is introduced
> Can you explain this in more detail? Why is this step needed only when
> the image file is opened? After the image file is opened and some guest
> write requests have completed, maybe the refcount fixup step needs to
> be done once.
If we don't update refcounts on disk then they become outdated and no
longer reflect the true allocation information. It's not safe to rely
on outdated refcount information since we could allocate the same
cluster multiple times - this means data corruption. By running a
consistency check when opening a dirty image file we guarantee that we
have accurate refcount information again.
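
A simplified Python sketch of why stale refcounts are dangerous and how a
consistency check repairs them. The data layout is an assumption made for
illustration (plain lists of cluster indexes instead of the real qcow2 L1/L2
tables), and the function names are hypothetical.

```python
from collections import Counter


def rebuild_refcounts(l2_tables, metadata_clusters):
    """Recompute refcounts by counting every reference to each cluster.

    l2_tables: list of tables, each a list of cluster indexes
               (None means the entry is unallocated).
    metadata_clusters: cluster indexes used by the image's own metadata.
    """
    refcounts = Counter(metadata_clusters)
    for table in l2_tables:
        for cluster in table:
            if cluster is not None:
                refcounts[cluster] += 1
    return refcounts


def first_free_cluster(refcounts):
    """Return the lowest cluster index whose refcount is zero."""
    index = 0
    while refcounts.get(index, 0) != 0:
        index += 1
    return index


# On-disk refcounts are stale: cluster 3 was allocated for guest data,
# but its refcount was never written to the image file.
stale = Counter({0: 1, 1: 1, 2: 1})
l2_tables = [[2, 3, None]]
metadata = [0, 1]

unsafe_choice = first_free_cluster(stale)   # picks cluster 3 again
fixed = rebuild_refcounts(l2_tables, metadata)
safe_choice = first_free_cluster(fixed)     # skips cluster 3
```

With the stale refcounts the allocator hands out cluster 3 a second time,
which is exactly the double allocation described above.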
As an optimization we will commit refcount information to disk when
closing the image file and mark it clean. This means a clean QEMU
shutdown does not require a consistency check on startup - but in the
worst case (power failure or crash) we will have a dirty image file.
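
The open/close behaviour can be sketched as a small state machine in Python.
This is a model under stated assumptions, not the actual patch: the class and
attribute names are hypothetical, and marking the image dirty on the first
write is my assumption about when the flag would be set.

```python
class LazyRefcountImage:
    """Toy model of an image file with a persistent dirty flag."""

    def __init__(self):
        self.dirty_on_disk = False   # stands in for a flag in the image header
        self.checks_run = 0

    def open(self):
        if self.dirty_on_disk:
            # Previous session did not close cleanly: refcounts may be stale,
            # so run the consistency check before allocating anything.
            self.checks_run += 1
            self.dirty_on_disk = False

    def write(self):
        # Assumption: the flag is set once refcount updates start being
        # deferred, i.e. on the first allocating write.
        self.dirty_on_disk = True

    def close(self):
        # Commit refcounts to disk, then mark the image clean.
        self.dirty_on_disk = False


img = LazyRefcountImage()

# Clean shutdown: no check is needed on the next open.
img.open()
img.write()
img.close()
img.open()
checks_after_clean_shutdown = img.checks_run

# Crash or power failure: close() never runs, the flag stays set.
img.write()
img.open()
checks_after_crash = img.checks_run
```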