From: Paolo Bonzini
Subject: Re: [Qemu-block] [PATCH] util/async: use qemu_aio_coroutine_enter in co_schedule_bh_cb
Date: Thu, 13 Sep 2018 16:58:17 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.9.1

On 05/09/2018 11:33, Sergio Lopez wrote:
> AIO Coroutines shouldn't be managed by an AioContext different than the
> one assigned when they are created. aio_co_enter avoids entering a
> coroutine from a different AioContext, calling aio_co_schedule instead.
>
> Scheduled coroutines are then entered by co_schedule_bh_cb using
> qemu_coroutine_enter, which just calls qemu_aio_coroutine_enter with the
> current AioContext obtained with qemu_get_current_aio_context.
> Eventually, co->ctx will be set to the AioContext passed as an argument
> to qemu_aio_coroutine_enter.
>
> This means that, if an IO Thread's AioContext is being processed by the
> Main Thread (due to aio_poll being called with a BDS AioContext, as it
> happens in AIO_WAIT_WHILE among other places), the AioContext from some
> coroutines may be wrongly replaced with the one from the Main Thread.
>
> This is the root cause behind some crashes, mainly triggered by the
> drain code in block/io.c. The most common are this abort and this failed
> assertion:
>
> util/async.c:aio_co_schedule
>     if (scheduled) {
>         fprintf(stderr,
>                 "%s: Co-routine was already scheduled in '%s'\n",
>                 __func__, scheduled);
>         abort();
>     }
>
> util/qemu-coroutine-lock.c:
>     assert(mutex->holder == self);
>
> But it's also known to cause random errors at different locations, and
> even SIGSEGV with broken coroutine backtraces.
>
> By using qemu_aio_coroutine_enter directly in co_schedule_bh_cb, we can
> pass the correct AioContext as an argument, making sure co->ctx is not
> wrongly altered.
>
> Signed-off-by: Sergio Lopez <address@hidden>
> ---
> util/async.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/util/async.c b/util/async.c
> index 05979f8014..c10642a385 100644
> --- a/util/async.c
> +++ b/util/async.c
> @@ -400,7 +400,7 @@ static void co_schedule_bh_cb(void *opaque)
>
>          /* Protected by write barrier in qemu_aio_coroutine_enter */
>          atomic_set(&co->scheduled, NULL);
> -        qemu_coroutine_enter(co);
> +        qemu_aio_coroutine_enter(ctx, co);
>          aio_context_release(ctx);
>      }
>  }
>
Ouch.
Reviewed-by: Paolo Bonzini <address@hidden>
Paolo
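
For readers following the thread, the failure mode described in the commit message can be sketched with a small standalone model. This is only an illustration, not QEMU code: every Toy* type and toy_* function below is hypothetical, and the model assumes nothing beyond what the commit message states, namely that entering a coroutine records the passed context in co->ctx and that the plain enter call uses the caller's current context.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for AioContext and Coroutine; not the QEMU types. */
typedef struct ToyContext { const char *name; } ToyContext;
typedef struct ToyCoroutine { ToyContext *ctx; } ToyCoroutine;

static ToyContext main_ctx = { "main-loop" };
static ToyContext iothread_ctx = { "iothread0" };

/*
 * Models the result of qemu_get_current_aio_context() when the main
 * thread is the one polling the IO thread's context.
 */
static ToyContext *toy_current_ctx = &main_ctx;

/* Models qemu_aio_coroutine_enter(ctx, co): co->ctx becomes ctx. */
static void toy_aio_coroutine_enter(ToyContext *ctx, ToyCoroutine *co)
{
    co->ctx = ctx;
    /* The real function would switch into the coroutine here. */
}

/* Models qemu_coroutine_enter(co): uses the caller's current context. */
static void toy_coroutine_enter(ToyCoroutine *co)
{
    toy_aio_coroutine_enter(toy_current_ctx, co);
}

/* Models the callback before the patch: ctx is known but not passed on. */
static void toy_schedule_cb_before(ToyContext *ctx, ToyCoroutine *co)
{
    (void)ctx;
    toy_coroutine_enter(co);
}

/* Models the callback after the patch: the owning context is passed. */
static void toy_schedule_cb_after(ToyContext *ctx, ToyCoroutine *co)
{
    toy_aio_coroutine_enter(ctx, co);
}

/* Returns 1 if the model reproduces the behaviour described above. */
static int toy_demo(void)
{
    ToyCoroutine before = { &iothread_ctx };
    ToyCoroutine after = { &iothread_ctx };

    toy_schedule_cb_before(&iothread_ctx, &before);
    toy_schedule_cb_after(&iothread_ctx, &after);

    /* Before: context replaced by the main one. After: preserved. */
    return before.ctx == &main_ctx && after.ctx == &iothread_ctx;
}
```

In this model, calling toy_demo() from a main() returns 1: the pre-patch path leaves the coroutine attached to the main-loop context, while the post-patch path keeps it on the context that scheduled it.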