From: Ian Jackson
Subject: Re: [Qemu-devel] [PATCH] Asynchronous reliable and configurable cache flush
Date: Thu, 3 Apr 2008 13:06:50 +0100

Jamie Lokier writes ("Re: [Qemu-devel] [PATCH] Asynchronous reliable and 
configurable cache flush"):
> Ian Jackson wrote:
> > Doing it with bdrv_aio_flush wasn't too hard, so that's what'll be in
> > my revised patch shortly.
> For "uncached" writes, do you wait until _after_ the aio_write has
> completed before calling aio_fsync, or do you assume the aio_fsync
> will wait for all aio_writes queued (but not completed) prior to it,
> as Marcelo Tosatti believes the documentation implies?

I do the former.  In general, I assume that in the following case:

    aio_write
     returns 0, operation is in progress
       .                      aio_fsync
       .                       returns 0, operation is in progress
       .                          .
   aio_return on the aio_write    .
    says completed OK             .
                              aio_return on the aio_fsync
                               says completed OK

the aio_f(data)sync completion does not tell us that the write (which
had not been completed at the time aio_fsync was called) has reached
stable storage.

I don't agree with Marcelo's analysis.

I think SUSv3 unhelpfully uses the word `queued' in two different
ways: firstly, to refer to data for which a write(2) or equivalent has
completed but where the data has not yet reached stable storage, and
secondly to refer to asynchronous IO operations (including aio_write
and aio_fsync) which have been submitted but not yet completed.

The former usage is demonstrated clearly in the SUSv3 page for
write(2) which describes the data as having been `queued' when
write(2) completes - i.e. when it is in the buffer cache.  This
interpretation for write(2) is supported by the page for read(2) not
mentioning queues (except for STREAMS queues which are irrelevant).

The latter usage is demonstrated by the spec for aio_read, which talks
about a read having been queued, and even for aio_fsync which is
specified to return when `the synchronisation request has been
... queued'.

Clearly it is the former meaning of `queue' which is referred to by
the specification of fsync and fdatasync since those calls apply to
data from write(2) as well.
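
As a small illustration of this former sense, here is a sketch (the
name write_durably is hypothetical) in which write(2) has already
returned, so the data is merely `queued' in the buffer cache, and
fdatasync is what pushes that queued data to stable storage:

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns 0 once the data has been handed to
 * write(2) and then flushed with fdatasync, or -1 on error. */
int write_durably(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before writing: retry */
            return -1;
        }
        buf += n;               /* short write: carry on with the rest */
        len -= (size_t)n;
    }

    /* All write(2) calls have completed; the data is `queued' in the
     * first sense only.  fdatasync waits for it to reach the disk. */
    return fdatasync(fd);
}
```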

In the spec for aio_write, `queued' is used in the first paragraph
with an identical wording to that for aio_read, so has the latter of
the two meanings above.

Sadly in aio_fsync these two meanings of `queued' appear in different
parts of the page.  Firstly, we have the statement that aio_fsync
shall return when the synchronisation request has been queued
(identical wording to that used for aio_read and aio_write).  Then in
the second paragraph we refer to queued IO operations but this can
only refer to the former, fsync, sense (since the text explicitly says
`as if by a call to fsync' and since otherwise the reference would be
semi-recursive as the aio_fsync would itself be a queued IO
operation).

If one thinks that both of these meanings of `queue' are the same:
what is the difference between aio_write and write(2) ?  Why can't
write(2) return immediately just like aio_write ?
