[Qemu-devel] Re: Notes on block I/O data integrity

From: Christoph Hellwig
Subject: [Qemu-devel] Re: Notes on block I/O data integrity
Date: Thu, 27 Aug 2009 15:42:39 +0200
User-agent: Mutt/1.3.28i

On Thu, Aug 27, 2009 at 08:21:55PM +0930, Rusty Russell wrote:
> >  - virtio-blk needs to advertise ordered queue by default.
> >    This makes cache=writethrough safe on virtio.
> From a guest POV, that's "we don't know, let's say we're ordered because that
> may make us safer".  Of course, it may not help: how much does it cost to
> drain the queue?
> The bug, IMHO is that we *should* know.  And in future I'd like to fix that,
> either by adding a VIRTIO_BLK_F_ORDERED feature, or a VIRTIO_BLK_F_UNORDERED
> feature.
> > Action plan for QEMU:
> > 
> >  - IDE needs to set the write cache enabled bit
> >  - virtio needs to implement a cache flush command and advertise it
> >    (also needs a small change to the host driver)
> So, virtio-blk needs to be enhanced for this as well.

Really, enabling volatile write caches without advertising a cache flush
command is a bug in the storage, where in our case qemu is the storage.
So I don't really see the need for two feature bits.  Here's my plan for

 - add a new VIRTIO_BLK_F_WCACHE feature.  If this feature is set we
     (a) implement the prepare_flush queue operation to send a
         standalone cache flush
     (b) set a proper barrier ordering flag on the queue

        Now I'm not entirely sure which queue ordering feature we will
        use.  It is not going to be QUEUE_ORDERED_TAG as for
        VIRTIO_BLK_F_BARRIER as that leaves all the queue draining to
        the host.  Which for everything that uses something resembling
        POSIX I/O as a backend and has more than one outstanding command
        at a time just means duplicating all the queue management we
        already do in the guest for no gain.
        The easiest one would be QUEUE_ORDERED_DRAIN_FLUSH, in which
        case the cache flush command really is everything we need.
        As a slight optimization of it we could make it
        QUEUE_ORDERED_DRAIN_FUA which still does all the queue draining
        in the guest, but only sends one explicit cache flush before the
        barrier and then sets the FUA bit on the actual barrier
        request.  In qemu we still would implement this as fdatasync
        before and after the request, but we would save one protocol
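
A minimal sketch of the "fdatasync before and after the request"
emulation described above, assuming a plain POSIX file descriptor as the
backend.  barrier_write is a hypothetical helper name used only for
illustration; it is not an existing qemu function.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: emulate a barrier (FUA-style) write on a POSIX
 * file backend.  Returns 0 on success or a negative errno value.
 */
static int barrier_write(int fd, const void *buf, size_t len, off_t offset)
{
    const char *p = buf;

    assert(buf != NULL);

    /* Make everything written before the barrier stable first. */
    if (fdatasync(fd) < 0)
        return -errno;

    /* Write the barrier request itself, handling short writes. */
    while (len > 0) {
        ssize_t n = pwrite(fd, p, len, offset);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -errno;
        }
        if (n == 0)
            return -EIO;
        p += n;
        len -= (size_t)n;
        offset += n;
    }

    /* Make the barrier request stable before completing it. */
    if (fdatasync(fd) < 0)
        return -errno;
    return 0;
}
```

With QUEUE_ORDERED_DRAIN_FLUSH the guest would instead send the two
flushes as separate commands around an ordinary write.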

Now the big question is when do we set the VIRTIO_BLK_F_WCACHE feature.
The proper thing to do would be to set it for cache=writeback and
cache=none, because they do need the fdatasync, and not for
cache=writethrough because it does not require it.
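
As a rough sketch of that rule (the enum, the function names and the bit
number are all assumptions made for illustration; no bit has been
assigned to VIRTIO_BLK_F_WCACHE here):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum cache_mode { CACHE_WRITETHROUGH, CACHE_WRITEBACK, CACHE_NONE };

/* Placeholder bit number, chosen only so the sketch compiles. */
#define SKETCH_F_WCACHE_BIT 31

/* Does this cache mode need an explicit fdatasync to make data stable? */
static bool has_volatile_cache(enum cache_mode mode)
{
    switch (mode) {
    case CACHE_WRITEBACK:       /* data may sit in the host page cache */
    case CACHE_NONE:            /* data may sit in the disk write cache */
        return true;
    case CACHE_WRITETHROUGH:    /* writes are already synchronous */
        return false;
    }
    assert(!"unknown cache mode");
    return true;
}

/* Add the write cache feature bit only when the guest has to flush. */
static uint32_t advertise_features(uint32_t features, enum cache_mode mode)
{
    if (has_volatile_cache(mode))
        features |= UINT32_C(1) << SKETCH_F_WCACHE_BIT;
    return features;
}
```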

Now Avi is a big advocate of the view that cache=writeback should mean go
fast and loose and not care about data integrity.  There's a certain point
to that as I don't really see a good use case for that mode, but I
really hate to make something unsafe that doesn't explicitly say so
in the option name.

The complex (not to say over-engineered) version would be to split
the caching and data integrity setting into two options:

 (1) hostcache=on|off
        use buffered vs O_DIRECT I/O
 (2) integrity=osync|fsync|none
        use O_SYNC, use f(data)sync or do not care about data integrity
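
A sketch of how the two options could be decoded independently.  The
struct and function names are hypothetical, and the policy is kept as
plain booleans so the caller decides how to turn them into open(2) flags:

```c
#include <assert.h>
#include <stdbool.h>

enum integrity_mode { INTEGRITY_OSYNC, INTEGRITY_FSYNC, INTEGRITY_NONE };

struct io_policy {
    bool open_o_direct;    /* hostcache=off: open the image with O_DIRECT */
    bool open_o_sync;      /* integrity=osync: open the image with O_SYNC */
    bool flush_on_request; /* integrity=fsync: f(data)sync on guest flush */
};

/* Translate the two proposed options into a backend I/O policy. */
static struct io_policy make_io_policy(bool hostcache,
                                       enum integrity_mode integrity)
{
    struct io_policy p = { false, false, false };

    p.open_o_direct = !hostcache;

    switch (integrity) {
    case INTEGRITY_OSYNC:
        p.open_o_sync = true;
        break;
    case INTEGRITY_FSYNC:
        p.flush_on_request = true;
        break;
    case INTEGRITY_NONE:
        /* fast and loose: no stable storage guarantee at all */
        break;
    default:
        assert(!"unknown integrity mode");
    }
    return p;
}
```

Under this split, cache=writethrough would presumably become
hostcache=on with integrity=osync, and cache=none would become
hostcache=off with integrity=fsync.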
