qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH 1/1] coroutine-lock: do not touch coroutine afte


From: Paolo Bonzini
Subject: Re: [Qemu-devel] [PATCH 1/1] coroutine-lock: do not touch coroutine after another one has been entered
Date: Tue, 30 May 2017 11:35:06 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.1.0


On 30/05/2017 10:53, Roman Pen wrote:
> The fix is obvious: use temporal queue and do not touch coroutine after
> first qemu_coroutine_enter() is invoked.
> 
> The issue is quite rare and happens every ~12 hours on very high IO load
> (building linux kernel with -j512 inside guest) when IO throttling is
> enabled.  With the fix applied guest is running ~35 hours and is still
> alive so far.

This is a good one.  Thanks!  Just a couple notes on English below.

> diff --git a/util/qemu-coroutine-lock.c b/util/qemu-coroutine-lock.c
> index 6328eed26bc6..026f5297f0f9 100644
> --- a/util/qemu-coroutine-lock.c
> +++ b/util/qemu-coroutine-lock.c
> @@ -77,10 +77,23 @@ void coroutine_fn qemu_co_queue_wait(CoQueue *queue, 
> CoMutex *mutex)
>  void qemu_co_queue_run_restart(Coroutine *co)
>  {
>      Coroutine *next;
> +    QSIMPLEQ_HEAD(, Coroutine) tmp_queue_wakeup =
> +        QSIMPLEQ_HEAD_INITIALIZER(tmp_queue_wakeup);
>  
>      trace_qemu_co_queue_run_restart(co);
> -    while ((next = QSIMPLEQ_FIRST(&co->co_queue_wakeup))) {
> -        QSIMPLEQ_REMOVE_HEAD(&co->co_queue_wakeup, co_queue_next);
> +
> +    /*
> +     * We use temporal queue in order not to touch 'co' after entering
> +     * 'next' coroutine.  The thing is that 'next' coroutine can resume
> +     * current 'co' coroutine and eventually terminate (free) it (see
> +     * linux-aio.c: ioq_submit() where qemu_laio_process_completions()
> +     * is invoked).  The rule of thumb is simple: do not touch coroutine
> +     * when you enter another one.
> +     */

/* Because "co" has yielded, any coroutine that we wakeup can resume it.
 * If this happens and "co" terminates, co->co_queue_wakeup becomes
 * invalid memory.  Therefore, use a temporary queue and do not touch
 * the "co" coroutine as soon as you enter another one.
 */

> +    QSIMPLEQ_CONCAT(&tmp_queue_wakeup, &co->co_queue_wakeup);
> +
> +    while ((next = QSIMPLEQ_FIRST(&tmp_queue_wakeup))) {
> +        QSIMPLEQ_REMOVE_HEAD(&tmp_queue_wakeup, co_queue_next);
>          qemu_coroutine_enter(next);
>      }
>  }
> diff --git a/util/qemu-coroutine.c b/util/qemu-coroutine.c
> index 486af9a62275..d020b63742af 100644
> --- a/util/qemu-coroutine.c
> +++ b/util/qemu-coroutine.c
> @@ -126,6 +126,13 @@ void qemu_aio_coroutine_enter(AioContext *ctx, Coroutine 
> *co)
>  
>      qemu_co_queue_run_restart(co);
>  
> +    /*
> +     * BEWARE: in case of ret == COROUTINE_YIELD here at this point
> +     *         after qemu_co_queue_run_restart() 'co' can be already
> +     *         freed by other coroutine, which has entered 'co'. So
> +     *         be careful and do not touch it.
> +     */

/* Beware, if ret == COROUTINE_YIELD and qemu_co_queue_run_restart()
 * has started any other coroutine, "co" might have been reentered
 * and even freed by now!  So be careful and do not touch it.
 */

>      switch (ret) {
>      case COROUTINE_YIELD:
>          return;
> 

Paolo



reply via email to

[Prev in Thread] Current Thread [Next in Thread]