From: Sage Weil
Subject: Re: Fwd: [Qemu-devel] bdrv_flush for qemu block drivers nbd, rbd and sheepdog
Date: Fri, 22 Oct 2010 09:22:00 -0700 (PDT)

On Fri, 22 Oct 2010, Kevin Wolf wrote:
> [ Adding qemu-devel to CC again ]
> 
> Am 21.10.2010 20:59, schrieb Sage Weil:
> > On Thu, 21 Oct 2010, Christian Brunner wrote:
> >> Hi,
> >>
> >> is there a flush operation in librados? - I guess the only way to
> >> handle this, would be waiting until all aio requests are finished?
> 
> That's not the semantics of bdrv_flush, you don't need to wait for
> running requests. You just need to make sure that all completed requests
> are safe on disk so that they would persist even in case of a
> crash/power failure.

Okay, in that case we're fine.  librados doesn't declare a write committed 
until it is safely on disk on multiple backend nodes.  There is a 
mechanism to get an ack sooner, but the qemu storage driver does not use 
it.  
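The two-stage acknowledgement described above can be modeled with a small sketch. This is a toy model, not the real librados API; all names here are illustrative. The point is that the qemu storage driver treats a write as complete only at the commit stage, not at the earlier ack:

```c
#include <assert.h>

/* Toy model (not the real librados API) of the two acknowledgement
 * stages: an "ack" when the write is received and readable, and a
 * "commit" only once it is safe on disk on multiple backend nodes.
 * The qemu driver waits for the commit stage. */

typedef struct {
    int acked;       /* write visible, but may still be volatile */
    int committed;   /* durable on multiple replicas */
} ToyCompletion;

static void replica_persisted(ToyCompletion *c, int replicas_safe,
                              int required)
{
    c->acked = 1;
    if (replicas_safe >= required) {
        c->committed = 1;   /* now safe even across power failure */
    }
}

/* Completion check as the qemu driver would want it:
 * durability, not just an ack. */
static int write_is_durable(const ToyCompletion *c)
{
    return c->committed;
}
```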

> > There is no flush currently.  But librados does no caching, so in this 
> > case at least silently upgrading to cache=writethrough should work.
> 
> You're making sure that the data can't be cached in the server's page
> cache or volatile disk cache either, e.g. by using O_SYNC for the image
> file? If so, upgrading would be safe.

Right.
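The "empty bdrv_flush" upgrade under discussion could look roughly like the following. This is a simplified, hypothetical model of the dispatch, not qemu's actual internal structures: a driver whose protocol already gives writethrough semantics satisfies flush with a no-op, while a driver that registers no callback gets -ENOTSUP once bdrv_flush grows a return value:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified model of qemu's block-driver dispatch;
 * names are illustrative, not qemu's real structures. */
typedef struct BlockDriver {
    const char *format_name;
    int (*flush)(void *opaque);   /* NULL if the driver has no flush */
} BlockDriver;

/* A driver whose protocol already provides writethrough semantics
 * (every completed write is durable) can satisfy flush with a no-op. */
static int noop_flush(void *opaque)
{
    (void)opaque;
    return 0;   /* nothing sits in a volatile cache, nothing to do */
}

/* Generic flush with a return value: a missing callback now becomes
 * -ENOTSUP instead of silently succeeding. */
static int generic_flush(BlockDriver *drv, void *opaque)
{
    if (drv->flush == NULL) {
        return -ENOTSUP;
    }
    return drv->flush(opaque);
}
```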

> > If that's a problem, we can implement a flush.  Just let us know.
> 
> Presumably providing a writeback mode with explicit flushes could
> improve performance. Upgrading to writethrough is not a correctness
> problem, though, so it's your decision if you want to implement it.

So is a bdrv_flush generated when e.g. the guest filesystem issues a 
barrier, or would otherwise normally ask a SATA disk to flush its cache?

sage
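The writeback-with-explicit-flush mode Kevin mentions can be illustrated with a toy model. None of this is qemu code; it only shows why a flush (triggered e.g. by a guest barrier or SATA FLUSH CACHE) is what makes completed writes survive a power failure in writeback mode:

```c
#include <assert.h>
#include <string.h>

/* Toy model of writeback caching: completed writes may sit in a
 * volatile cache until an explicit flush makes them durable. */

enum { NSECTORS = 8 };

typedef struct {
    char cache[NSECTORS];    /* volatile: lost on power failure */
    char disk[NSECTORS];     /* durable storage */
    int  dirty[NSECTORS];
} ToyDev;

static void toy_write(ToyDev *d, int sector, char val)
{
    d->cache[sector] = val;
    d->dirty[sector] = 1;    /* completed, but not yet durable */
}

/* bdrv_flush analogue: make every completed write durable. */
static void toy_flush(ToyDev *d)
{
    for (int i = 0; i < NSECTORS; i++) {
        if (d->dirty[i]) {
            d->disk[i] = d->cache[i];
            d->dirty[i] = 0;
        }
    }
}

/* Simulated power loss: the volatile cache is gone; only data that
 * was flushed survives on disk. */
static void toy_crash(ToyDev *d)
{
    memset(d->cache, 0, sizeof(d->cache));
    memset(d->dirty, 0, sizeof(d->dirty));
}
```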



> Kevin
> 
> >> ---------- Forwarded message ----------
> >> From: Kevin Wolf <address@hidden>
> >> Date: 2010/10/21
> >> Subject: [Qemu-devel] bdrv_flush for qemu block drivers nbd, rbd and 
> >> sheepdog
> >> To: Christian Brunner <address@hidden>, Laurent Vivier
> >> <address@hidden>, MORITA Kazutaka <address@hidden>
> >> Cc: address@hidden
> >>
> >>
> >> Hi all,
> >>
> >> I'm currently looking into adding a return value to qemu's bdrv_flush
> >> function and I noticed that your block drivers (nbd, rbd and sheepdog)
> >> don't implement bdrv_flush at all. bdrv_flush is going to return
> >> -ENOTSUP for any block driver not implementing this, effectively
> >> breaking these three drivers for anything but cache=unsafe.
> >>
> >> Is there a specific reason why your drivers don't implement this? I
> >> think I remember that one of the drivers always provides
> >> cache=writethrough semantics. It would be okay to silently "upgrade" to
> >> cache=writethrough, so in this case I'd just need to add an empty
> >> bdrv_flush implementation.
> >>
> >> Otherwise, we really cannot allow any option except cache=unsafe because
> >> that's the semantics provided by the driver.
> >>
> >> In any case, I think it would be a good idea to implement a real
> >> bdrv_flush function to allow the write-back cache modes cache=off and
> >> cache=writeback in order to improve performance over writethrough.
> >>
> >> Is this possible with your protocols, or can the protocol be changed to
> >> consider this? Any hints on how to proceed?
> >>
> >> Kevin
> >> --
> >> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> >> the body of a message to address@hidden
> >> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >>
> >>