
Re: [Qemu-devel] [PATCH 2/3] barriers: block-raw-posix barrier support


From: Jamie Lokier
Subject: Re: [Qemu-devel] [PATCH 2/3] barriers: block-raw-posix barrier support
Date: Tue, 5 May 2009 17:00:11 +0100
User-agent: Mutt/1.5.13 (2006-08-11)

Christoph Hellwig wrote:
> On Tue, May 05, 2009 at 01:33:11PM +0100, Jamie Lokier wrote:
> > You don't need two fdatasyncs if the barrier request is just a
> > barrier, no data write, used only to flush previously written data by
> > a guest's fsync/fdatasync implementation.
> 
> Yeah.  I'll put that optimization in after some testing.

I suggest keeping a flag "flush_needed".  Set it whenever a write is
submitted, don't submit fsync/fdatasync when the flag is clear, clear
it whenever an fsync/fdatasync is submitted.  Provides a few more
optimisation opportunities.
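
A minimal sketch of that flag in C, in case it helps.  The struct and
function names (raw_state, raw_write, raw_flush) are made up for
illustration and are not the actual block-raw-posix code; it also
assumes a single submitting thread, so the flag needs no locking.

```c
#define _XOPEN_SOURCE 700
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical per-device state, not QEMU's real structure. */
struct raw_state {
    int fd;
    bool flush_needed;   /* a write was submitted since the last flush */
};

static ssize_t raw_write(struct raw_state *s, const void *buf,
                         size_t len, off_t off)
{
    s->flush_needed = true;          /* set whenever a write is submitted */
    return pwrite(s->fd, buf, len, off);
}

/* Returns 1 if fdatasync was issued, 0 if it was skipped, -1 on error. */
static int raw_flush(struct raw_state *s)
{
    if (!s->flush_needed)
        return 0;                    /* nothing written: skip the syscall */

    s->flush_needed = false;         /* clear when the flush is submitted */
    if (fdatasync(s->fd) < 0) {
        s->flush_needed = true;      /* flush failed: data may still be dirty */
        return -1;
    }
    return 1;
}
```

With this, a data-less barrier that follows another flush with no
writes in between costs nothing.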

> > This is the best argument yet for having distinct "barrier" and "sync"
> > operations.  "Barrier" is for ordering I/O, such as journalling
> > filesystems.
> 
> Doesn't really help as long as we're using the normal Posix filesystem
> APIs on the host.  The only way to guarantee ordering of multiple
> *write* system calls is to call f(data)sync between them.

It doesn't help with journalling barriers, which I agree are dominant
in a lot of workloads, but it does help guest fsync-heavy workloads.

When "Sync && !Barrier" the guest doesn't require the full ordering
guarantee.

Therefore you can call f(data)sync _and_ issue some writes on other I/O
threads in parallel.  The f(data)sync mustn't be started until
previously queued writes are complete, but later-queued writes can be
issued in parallel with f(data)sync.

(Or if using Linux AIO, the same with aio_fsync and later-queued
aio_writes in parallel).

In other words, with a guest fdatasync-heavy workload, like a
database, it could keep the I/O pipeline busy instead of draining it
as the full barrier does.
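
The dispatch rule above can be sketched as a small predicate over the
request queue.  This is only a model of the ordering, not QEMU code:
the enum, the struct and can_start() are hypothetical names, and it
assumes requests sit in an array in submission order with a "done"
bit set on completion.

```c
#include <stdbool.h>
#include <stddef.h>

enum req_type { REQ_WRITE, REQ_SYNC, REQ_BARRIER };

struct req {
    enum req_type type;
    bool done;           /* set when the request has completed */
};

/* May q[i] be started now, given the requests queued before it? */
static bool can_start(const struct req *q, size_t i)
{
    for (size_t j = 0; j < i; j++) {
        if (q[j].done)
            continue;

        /* Full barrier: waits for everything earlier, and nothing
         * later may pass it while it is pending. */
        if (q[i].type == REQ_BARRIER || q[j].type == REQ_BARRIER)
            return false;

        /* Sync without barrier: must wait for earlier writes... */
        if (q[i].type == REQ_SYNC && q[j].type == REQ_WRITE)
            return false;

        /* ...but a later write is free to run alongside a pending
         * sync or write. */
    }
    return true;
}
```

For a queue of write, sync, write, the second write can start
immediately while the sync waits for the first write; replace the sync
with a barrier and the second write has to wait for it to complete.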

It won't help with a journalling-barrier-heavy workload, without
changes to the host to expose the distinct barrier types - i.e. a more
flexible alternative to f(data)sync, such as is occasionally discussed
elsewhere.

-- Jamie



