qemu-block
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-block] [Qemu-devel] [PATCH] [RFC v2] aio: properly bubble up e


From: Nishanth Aravamudan
Subject: Re: [Qemu-block] [Qemu-devel] [PATCH] [RFC v2] aio: properly bubble up errors from initialization
Date: Tue, 19 Jun 2018 15:35:57 -0700
User-agent: Mutt/1.9.4 (2018-02-28)

On 19.06.2018 [13:14:51 -0700], Nishanth Aravamudan wrote:
> On 19.06.2018 [14:35:33 -0500], Eric Blake wrote:
> > On 06/15/2018 12:47 PM, Nishanth Aravamudan via Qemu-devel wrote:

<snip>

> > >           } else if (s->use_linux_aio) {
> > > +            int rc;
> > > +            rc = aio_setup_linux_aio(bdrv_get_aio_context(bs));
> > > +            if (rc != 0) {
> > > +                error_report("Unable to use native AIO, falling back to "
> > > +                             "thread pool.");
> > 
> > In general, error_report() should not output a trailing '.'.
> 
> Will fix.
> 
> > > +                s->use_linux_aio = 0;
> > > +                return rc;
> > 
> > Wait - the message claims we are falling back, but the non-zero return code
> > sounds like we are returning an error instead of falling back.  (My
> > preference - if the user requested something and we can't do it, it's better
> > to error than to fall back to something that does not match the user's
> > request).
> 
> I think that makes sense, I hadn't tested this specific case (in my
> reading of the code, it wasn't clear to me if raw_co_prw() could be
> called before raw_aio_plug() had been called, but I think returning the
> error code up should be handled correctly. What about the cases where
> there is no error handling (the other two changes in the patch)?

While looking at doing these changes, I realized that I'm not quite sure
what the right approach is here. My original rationale for returning
non-zero was that AIO was requested but could not be completed. I
haven't fully tracked back the calling paths, but I assumed it would get
retried at the top level, and since we indicated to not use AIO on
subsequent calls, it will succeed and use threads then (note, that I do
now realize this means a mismatch between the qemu command-line and the
in-use AIO model).

In practice, with my v2 patch, where I do return a non-zero error-code
from this function, qemu does not exit (nor is any logging other than
that I added emitted on the monitor). If I do not fallback, I imagine we
would just continuously see this error message and IO might not actually
every occur? Reworking all of the callpath to fail on non-zero returns
from raw_co_prw() seems like a fair bit of work, but if that is what is
being requested, I can try that (it will just take a while).
Alternatively, I can produce a v3 quickly that does not bubble the
actual errno all the way up (since it does seem like it is ignored
anyways?).

<snip>

> > > +            s->use_linux_aio = 0;
> > 
> > Should s->use_linux_aio be a bool instead of an int?
> 
> It is:
> 
>     bool use_linux_aio:1;
> 
> would you prefer I did a preparatory patch that converted users to
> true/false?

Sorry, I misunderstood this -- only my patch does an assignment, so I'll
switch to 'false'.

Thanks,
Nish



reply via email to

[Prev in Thread] Current Thread [Next in Thread]