qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [RFC for 2.9 1/1] block: add missed aio_context_acquire


From: Max Reitz
Subject: Re: [Qemu-devel] [RFC for 2.9 1/1] block: add missed aio_context_acquire into blk_unref
Date: Mon, 27 Mar 2017 17:45:37 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.8.0

@Subject: Do you mean "PATCH for-2.9?"? Because "RFC" to me means
"please do not merge". ;-)

I wouldn't mind a change like this to go into 2.9.

On 27.03.2017 17:35, Denis V. Lunev wrote:
> Recently we expirience hang with iothreads enabled with the following
> call trace:
> Thread 1 (Thread 0x7fa95efebc80 (LWP 177117)):
> 0  ppoll () from /lib64/libc.so.6
> 2  qemu_poll_ns () at qemu-timer.c:313
> 3  aio_poll () at aio-posix.c:457
> 4  bdrv_flush () at block/io.c:2641
> 5  bdrv_close () at block.c:2143
> 6  bdrv_delete () at block.c:2352
> 7  bdrv_unref () at block.c:3429
> 8  blk_remove_bs () at block/block-backend.c:427
> 9  blk_delete () at block/block-backend.c:178
> 10 blk_unref () at block/block-backend.c:226
> 11 object_property_del_all () at qom/object.c:399
> 12 object_finalize () at qom/object.c:461
> 13 object_unref () at qom/object.c:898
> 14 object_property_del_child () at qom/object.c:422
> 15 qmp_marshal_device_del () at qmp-marshal.c:1145
> 16 handle_qmp_command () at /usr/src/debug/qemu-2.6.0/monitor.c:3929
> 
> Technically bdrv_flush() stucks in
>     while (rwco.ret == NOT_DONE) {
>         aio_poll(aio_context, true);
>     }
> but rwco.ret is equal to 0 thus we have missed wakeup. Code investigation
> reveals that we do not have performed aio_context_acquire() on this call
> stack.
> 
> This patch adds missed lock.
> 
> Signed-off-by: Denis V. Lunev <address@hidden>
> CC: Kevin Wolf <address@hidden>
> CC: Max Reitz <address@hidden>
> CC: Eric Blake <address@hidden>
> CC: Markus Armbruster <address@hidden>
> ---
>  block/block-backend.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/block/block-backend.c b/block/block-backend.c
> index 5742c09..65d5da9 100644
> --- a/block/block-backend.c
> +++ b/block/block-backend.c
> @@ -273,7 +273,11 @@ void blk_unref(BlockBackend *blk)
>      if (blk) {
>          assert(blk->refcnt > 0);
>          if (!--blk->refcnt) {
> +            AioContext *ctx = blk_get_aio_context(blk);
> +
> +            aio_context_acquire(ctx);
>              blk_delete(blk);
> +            aio_context_release(ctx);

But I don't think this is quite the correct place. The caller of
blk_unref() should have acquired the AioContext already. As far as I can
tell in this case that would be originally release_drive() in
hw/core/qdev-properties-system.c and then blk_detach_dev().

I think the former would be the somehow more correct place but I can
imagine the latter to be more useful in reality. I'll leave it to you.

(As an alternative, you may of course convince me that this patch is
indeed correct and should be taken as-is. :-))

Max

>          }
>      }
>  }
> 


Attachment: signature.asc
Description: OpenPGP digital signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]