[Qemu-devel] Re: [PATCH v2 6/7] qed: Read/write support


From: Stefan Hajnoczi
Subject: [Qemu-devel] Re: [PATCH v2 6/7] qed: Read/write support
Date: Wed, 13 Oct 2010 13:13:29 +0100
User-agent: Mutt/1.5.20 (2009-06-14)

On Tue, Oct 12, 2010 at 11:16:17AM -0500, Anthony Liguori wrote:
> On 10/12/2010 10:59 AM, Stefan Hajnoczi wrote:
> >On Tue, Oct 12, 2010 at 05:39:48PM +0200, Kevin Wolf wrote:
> >>Am 12.10.2010 17:22, schrieb Anthony Liguori:
> >>>On 10/12/2010 10:08 AM, Kevin Wolf wrote:
> >>>>   Otherwise we might destroy data that isn't
> >>>>even touched by the guest request in case of a crash.
> >>>>
> >>>The failure scenarios are either that the cluster is leaked, in which
> >>>case the old version of the data is still present, or that the cluster
> >>>is orphaned because the L2 entry is written, in which case the old
> >>>version of the data is present.
> >>Hm, how does the latter case work? Or rather, what do you mean by "orphaned"?
> >>
> >>>Are you referring to a scenario where the cluster is partially written
> >>>because the data is present in the write cache and the write cache isn't
> >>>flushed on power failure?
> >>The case I'm referring to is a COW. So let's assume a partial write to
> >>an unallocated cluster; we then need to do a COW prefill/postfill. Then
> >>we do a normal write and link the new cluster into the L2 table.
> >>
> >>Assume that the write to the L2 table is already on the disk, but the
> >>pre/postfill data isn't yet. At this point we have a bad state because
> >>if we crash now we have lost the data that should have been copied from
> >>the backing file.
> >In this case QED_F_NEED_CHECK is set and the invalid cluster offset
> >should be reset to zero on open.
> >
> >However, I think we can get into a state where the pre/postfill data
> >isn't on the disk yet but another allocation has increased the file
> >size, making the unwritten cluster "valid".  This fools the consistency
> >check into thinking the data cluster (which was never written to on
> >disk) is valid.
> >
> >Will think about this more tonight.
> 
> It's fairly simple to add a sync to this path.  It's probably worth
> checking the prefill/postfill for zeros and avoiding the write/sync
> if that's the case.  That should optimize the common cases of
> allocating new space within a file.
> 
> My intuition is that we can avoid the sync entirely but we'll need
> to think about it further.

We can avoid it when a backing image is not used.  Your idea to check
for zeroes in the backing image is neat too; it may well speed up the
common case even for backing images.

Stefan


