[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-block] [PATCH 5/7] block/mirror: utilize job_exit shim

From: John Snow
Subject: Re: [Qemu-block] [PATCH 5/7] block/mirror: utilize job_exit shim
Date: Tue, 28 Aug 2018 16:25:40 -0400
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.8.0

On 08/25/2018 11:02 AM, Max Reitz wrote:
> If you say so...  I have to admit I don't really understand.  The
> comment doesn't explain why it's so important to keep src around until
> job_completed(), so I don't know.  I thought AioContexts are recursive
> so it doesn't matter whether you take them recursively or not.

bdrv_flush has troubles under a recursive lock if it is invoked from a
different thread. It tries to poll for flush completion but the
coroutine which gets scheduled (instead of entered) can't actually run
if we hold the lock twice from, say, the main thread -- which is where
we're doing the graph manipulation from.

>        co = qemu_coroutine_create(bdrv_flush_co_entry, &flush_co);
>        bdrv_coroutine_enter(bs, co);
>        BDRV_POLL_WHILE(bs, flush_co.ret == NOT_DONE);

BDRV_POLL_WHILE there causes us the grief via AIO_WAIT_WHILE, which only
puts down one reference, so we deadlock in bdrv_flush in a recursive

Kevin helped me figure it out; I CC'd him off-list on a mail that you
were also CC'd on ("hang in mirror_exit") that's probably pretty helpful:

> Took a little more than five minutes, but I think I've got it now. The
> important thing is that the test case is for dataplane, i.e. the job
> runs in an I/O thread AioContext. Job completion, however, happens in
> the main loop thread.
> Therefore, when bdrv_delete() wants to flush the node first, it needs to
> run bdrv_co_flush() in a foreign AioContext, so the coroutine is
> scheduled. The I/O thread backtrace shows that it's waiting for the
> AioContext lock before it can actually enter the bdrv_co_flush()
> coroutine, so we must have deadlocked:


reply via email to

[Prev in Thread] Current Thread [Next in Thread]