From: Nishanth Aravamudan
Subject: Re: [Qemu-block] [Qemu-devel] [PATCH] [RFC v2] aio: properly bubble up errors from initialization
Date: Tue, 19 Jun 2018 13:14:51 -0700
User-agent: Mutt/1.9.4 (2018-02-28)

On 19.06.2018 [14:35:33 -0500], Eric Blake wrote:
> On 06/15/2018 12:47 PM, Nishanth Aravamudan via Qemu-devel wrote:
> > laio_init() can fail for a couple of reasons, which will lead to a NULL
> > pointer dereference in laio_attach_aio_context().
> > 
> > To solve this, add an aio_setup_linux_aio() function which is called
> > before aio_get_linux_aio() where it is called currently, and which
> > propogates setup errors up. The signature of aio_get_linux_aio() was not
> 
> s/propogates/propagates/
 
Thanks!

> > modified, because it seems preferable to return the actual errno from
> > the possible failing initialization calls.
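[Editorial aside: a minimal compilable sketch of the setup/get split described above. AioContext and LinuxAioState here are stand-in stubs, not the real QEMU structures, and the error value is only an example of what laio_init() might report:]

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Stand-ins for QEMU's types; the real AioContext and LinuxAioState
 * live in include/block/aio.h and block/linux-aio.c. */
typedef struct LinuxAioState { int initialized; } LinuxAioState;
typedef struct AioContext { LinuxAioState *linux_aio; } AioContext;

static LinuxAioState global_state;
static int fail_init;  /* simulate io_setup() failing, e.g. aio-max-nr == 0 */

/* Setup step: may fail, and returns 0 or a negative errno to the caller. */
int aio_setup_linux_aio(AioContext *ctx)
{
    if (ctx->linux_aio) {
        return 0;          /* already initialized */
    }
    if (fail_init) {
        return -EAGAIN;    /* example errno from a failed initialization */
    }
    global_state.initialized = 1;
    ctx->linux_aio = &global_state;
    return 0;
}

/* The getter keeps its old signature: callers must run setup first,
 * otherwise this is where the NULL dereference would have happened. */
LinuxAioState *aio_get_linux_aio(AioContext *ctx)
{
    assert(ctx->linux_aio);
    return ctx->linux_aio;
}
```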
> > 
> > With respect to the error handling in file-posix.c, we properly
> > bubble any errors up in raw_co_prw, and in the cases of
> > raw_aio_{,un}plug the result is the same as if s->use_linux_aio were not
> > set (but there is no bubbling up). In all three cases, if the setup
> > function fails, we fall back to the thread pool and an error message is
> > emitted.
> > 
> > It is trivial to make qemu segfault in my testing. Set
> > /proc/sys/fs/aio-max-nr to 0 and start a guest with
> > aio=native,cache=directsync. With this patch, the guest successfully
> > starts (but obviously isn't using native AIO). Setting aio-max-nr back
> > up to a reasonable value, AIO contexts are consumed normally.
> > 
> > Signed-off-by: Nishanth Aravamudan <address@hidden>
> > 
> > ---
> > 
> > Changes from v1 -> v2:
> 
> When posting a v2, it's best to post as a new thread, rather than
> in-reply-to the v1 thread, so that automated tooling knows to check the new
> patch.  More patch submission tips at
> https://wiki.qemu.org/Contribute/SubmitAPatch

My apologies! I'll fix this in a (future) v3.

> > Rather than affect virtio-scsi/blk at all, make all the changes internal
> > to file-posix.c. Thanks to Kevin Wolf for the suggested change.
> > ---
> >   block/file-posix.c      | 24 ++++++++++++++++++++++++
> >   block/linux-aio.c       | 15 ++++++++++-----
> >   include/block/aio.h     |  3 +++
> >   include/block/raw-aio.h |  2 +-
> >   stubs/linux-aio.c       |  2 +-
> >   util/async.c            | 15 ++++++++++++---
> >   6 files changed, 51 insertions(+), 10 deletions(-)
> > 
> > diff --git a/block/file-posix.c b/block/file-posix.c
> > index 07bb061fe4..2415d09bf1 100644
> > --- a/block/file-posix.c
> > +++ b/block/file-posix.c
> > @@ -1665,6 +1665,14 @@ static int coroutine_fn raw_co_prw(BlockDriverState *bs, uint64_t offset,
> >               type |= QEMU_AIO_MISALIGNED;
> >   #ifdef CONFIG_LINUX_AIO
> >           } else if (s->use_linux_aio) {
> > +            int rc;
> > +            rc = aio_setup_linux_aio(bdrv_get_aio_context(bs));
> > +            if (rc != 0) {
> > +                error_report("Unable to use native AIO, falling back to "
> > +                             "thread pool.");
> 
> In general, error_report() should not output a trailing '.'.

Will fix.

> > +                s->use_linux_aio = 0;
> > +                return rc;
> 
> Wait - the message claims we are falling back, but the non-zero return code
> sounds like we are returning an error instead of falling back.  (My
> preference - if the user requested something and we can't do it, it's better
> to error than to fall back to something that does not match the user's
> request).

I think that makes sense; I hadn't tested this specific case. (In my
reading of the code, it wasn't clear to me whether raw_co_prw() could be
called before raw_aio_plug() had been called, but I think returning the
error code up should be handled correctly.) What about the cases where
there is no error handling (the other two changes in the patch)?
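[Editorial aside: to make the two behaviors under discussion concrete, here is a stubbed sketch. The prw_* functions and their sentinel return values are hypothetical stand-ins, not the real file-posix.c code: variant A is what the v2 patch does (quietly fall back to the thread pool), variant B is what Eric suggests (propagate the negative errno up):]

```c
#include <assert.h>
#include <errno.h>

static int setup_rc;  /* simulated aio_setup_linux_aio() result */

static int aio_setup(void)         { return setup_rc; }
static int submit_native(void)     { return 1; }  /* sentinel: native AIO path */
static int submit_threadpool(void) { return 2; }  /* sentinel: thread-pool path */

/* Variant A: on setup failure, disable native AIO and fall back. */
int prw_fallback(int *use_linux_aio)
{
    if (*use_linux_aio) {
        int rc = aio_setup();
        if (rc != 0) {
            *use_linux_aio = 0;          /* disable native AIO... */
            return submit_threadpool();  /* ...and quietly fall back */
        }
        return submit_native();
    }
    return submit_threadpool();
}

/* Variant B: the user explicitly asked for aio=native, so report the
 * setup failure to the caller as a negative errno instead. */
int prw_propagate(int *use_linux_aio)
{
    if (*use_linux_aio) {
        int rc = aio_setup();
        if (rc != 0) {
            return rc;  /* e.g. -EAGAIN bubbled up from initialization */
        }
        return submit_native();
    }
    return submit_threadpool();
}
```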

> > +            }
> >             LinuxAioState *aio = aio_get_linux_aio(bdrv_get_aio_context(bs));
> >               assert(qiov->size == bytes);
> >               return laio_co_submit(bs, aio, s->fd, offset, qiov, type);
> > @@ -1695,6 +1703,14 @@ static void raw_aio_plug(BlockDriverState *bs)
> >   #ifdef CONFIG_LINUX_AIO
> >       BDRVRawState *s = bs->opaque;
> >       if (s->use_linux_aio) {
> > +        int rc;
> > +        rc = aio_setup_linux_aio(bdrv_get_aio_context(bs));
> > +        if (rc != 0) {
> > +            error_report("Unable to use native AIO, falling back to "
> > +                         "thread pool.");
> > +            s->use_linux_aio = 0;
> 
> Should s->use_linux_aio be a bool instead of an int?

It is:

    bool use_linux_aio:1;

would you prefer I did a preparatory patch that converted users to
true/false?
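[Editorial aside: either style behaves identically for a one-bit bool bit-field, since any nonzero value assigned to a C99 bool converts to 1. A toy example, not the real BDRVRawState:]

```c
#include <assert.h>
#include <stdbool.h>

/* Toy struct mirroring the flag's declaration, not the real QEMU type. */
struct flags {
    bool use_linux_aio:1;
};

/* Assigning 0/1 or false/true to a bool bit-field is equivalent;
 * the conversion to bool normalizes any nonzero value to 1. */
bool set_and_read(struct flags *f, int value)
{
    f->use_linux_aio = value;
    return f->use_linux_aio;
}
```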

Thanks,
Nish
