Re: [Qemu-devel] [PATCH] bdrv_aio_flush

From: Ian Jackson
Subject: Re: [Qemu-devel] [PATCH] bdrv_aio_flush
Date: Mon, 1 Sep 2008 12:27:02 +0100

Andrea Arcangeli writes ("[Qemu-devel] [PATCH] bdrv_aio_flush"):
> while reading the aio/ide code I noticed the bdrv_flush operation is
> unsafe. When a write command is submitted with bdrv_aio_write and
> later bdrv_flush is called, fsync will do nothing. fsync only sees the
> kernel writeback cache. But the write command is still queued in the
> aio kernel thread and is still invisible to the kernel. bdrv_aio_flush
> will instead see both the regular bdrv_write (that submits data to the
> kernel synchronously) as well as the bdrv_aio_write as the fsync will
> be queued at the end of the aio queue and it'll be issued by the aio
> pthread thread itself.

I think this is fine; we discussed it some time ago.  bdrv_flush
guarantees only that _already completed_ IO operations are flushed.
It does not guarantee that in-flight AIO operations are completed and
then flushed to disk.
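
To make the two readings concrete, here is a toy model of the ordering
in question.  It is only a sketch and not QEMU code: the queue, the
bitmasks and every function name below are hypothetical, chosen for
illustration.  A "write" sits in the AIO queue until the worker
completes it, and a flush makes durable only what has already
completed at the moment it runs.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only; none of these names exist in QEMU. */
enum op_kind { OP_WRITE, OP_FLUSH };

struct op {
    enum op_kind kind;
    unsigned id;                /* write identifier, used as a bit index */
};

#define QUEUE_CAP 8

struct model {
    struct op queue[QUEUE_CAP]; /* operations still in flight */
    size_t head, tail;
    unsigned completed;         /* writes that reached the kernel cache */
    unsigned durable;           /* writes that reached the disk */
};

/* Queue an operation; nothing happens until the worker runs. */
static void submit(struct model *m, enum op_kind kind, unsigned id)
{
    assert(m->tail < QUEUE_CAP);
    m->queue[m->tail].kind = kind;
    m->queue[m->tail].id = id;
    m->tail++;
}

/* Synchronous flush: sees only writes that have already completed. */
static void sync_flush(struct model *m)
{
    m->durable |= m->completed;
}

/* Worker thread stand-in: drains the queue strictly in order. */
static void run_worker(struct model *m)
{
    while (m->head < m->tail) {
        struct op o = m->queue[m->head++];
        if (o.kind == OP_WRITE)
            m->completed |= 1u << o.id;
        else
            sync_flush(m);      /* flush issued from inside the queue */
    }
}

/* Case 1: flush called directly while write 0 is still queued. */
static unsigned durable_after_sync_flush(void)
{
    struct model m = {0};
    submit(&m, OP_WRITE, 0);
    sync_flush(&m);             /* write 0 has not completed yet */
    return m.durable;
}

/* Case 2: flush queued behind write 0, then the worker runs. */
static unsigned durable_after_queued_flush(void)
{
    struct model m = {0};
    submit(&m, OP_WRITE, 0);
    submit(&m, OP_FLUSH, 0);
    run_worker(&m);
    return m.durable;
}
```

In this model durable_after_sync_flush() leaves write 0 non-durable,
while durable_after_queued_flush() makes it durable.  Which of the two
a flush is required to provide is the question at issue here; the
model takes no side on that.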

> IDE works by luck because it can only submit one command at once (no
> tagged queueing) so supposedly the guest kernel driver will wait the
> IDE emulated device to return ready before issuing a journaling
> barrier with WIN_FLUSH_CACHE* but with scsi and tagged command
> queueing this bug in the aio common code will become visible and it'll
> break the journaling guarantees of the guest if there's a power loss
> in the host. So it's not urgent for IDE I think, but it clearly should
> be fixed in the qemu block model eventually.

I don't think this criticism is correct because I think the IDE FLUSH
CACHE command should be read the same way.  The spec I have here is
admittedly quite unclear but I can't see any reason to think that the
`write cache' which is referred to by the spec is regarded as
containing data which has not yet been DMA'd from the host to the disk
because the command which does that transfer is not yet complete.

