qemu-block
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 1/6] block/io.c: fix bdrv_child_cb_drained_begin invocations


From: Emanuele Giuseppe Esposito
Subject: Re: [PATCH 1/6] block/io.c: fix bdrv_child_cb_drained_begin invocations from a coroutine
Date: Mon, 14 Feb 2022 11:27:07 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.2.0


On 11/02/2022 12:54, Kevin Wolf wrote:
> Am 08.02.2022 um 16:36 hat Emanuele Giuseppe Esposito geschrieben:
>> Using bdrv_do_drained_begin_quiesce() in bdrv_child_cb_drained_begin()
>> is not a good idea: the callback might be called when running
>> a drain in a coroutine, and bdrv_drained_begin_poll() does not
>> handle that case, resulting in assertion failure.
> 
> I remembered that we talked about this only recently on IRC, but it
> didn't make any sense to me again when I read this commit message. So I
> think we need --verbose.
> 
> The .drained_begin callback was always meant to run outside of coroutine
> context, so the unexpected part isn't that it calls a function that
> can't run in coroutine context, but that it is already called itself in
> coroutine context.
> 
> The problematic path is bdrv_replace_child_noperm() which then calls
> bdrv_parent_drained_begin_single(poll=true). Polling in coroutine
> context is dangerous, it can cause deadlocks because the caller of the
> coroutine can't make progress. So I believe this call is already wrong
> in coroutine context.

Ok, you added this assertion in dcf94a23, but at that time there was no
bdrv_parent_drained_begin_single, and the polling was only done in
bdrv_do_drained_begin. So I think that to keep the same logic, the
assertion should be moved in bdrv_parent_drained_begin_single()? And
even more specifically, only if the poll flag is true.

I triggered this by adding additional drains in the callers of
bdrv_replace_child_noperm(), and I think some test (probably unit test)
was failing because of either the drained_begin callback itself called
by the drain, or as you suggested the callbacks called by
bdrv_parent_drained_begin_single from bdrv_replace_child_noperm.

Anyways, I think that in addition to the fix in this patch, we should
also fix bdrv_parent_drained_begin_single(poll=true) in
bdrv_replace_child_noperm, with something similar to what is done in
bdrv_co_yield_to_drain? ie if we are in coroutine, schedule a BH that
runs the same logic but in the main loop, but then somehow wait that it
finishes before continuing?
Even though at that point we would have a coroutine waiting for the main
loop, which I don't think it's something we want.

Alternatively, we would forbid polling in coroutines at all. And the
only place I can see that is using the drain in coroutine is mirror (see
below).


Additional question: I also noticed that there is a bdrv_drained_begin()
call in mirror.c in the JobDriver run() callback. How can this even
work? If a parent uses bdrv_child_cb_drained_begin (which should not be
so rare) it will crash because of the assertion.

Further additional question: actually I don't understand also the
polling logic of mirror (mirror_drained_poll), as if we are draining in
the coroutine with in_drain = true I think we can have a deadlock if
in_flight>0?

Emanuele

> 
> Now I don't know the call path up to bdrv_replace_child_noperm(), but as
> far as I remember, that was another function that was originally never
> run in coroutine context. Maybe we have good reason to change this, I
> can't point to anything that would be inherently wrong with it, but I
> would still be curious in which context it does run in a coroutine now.
> 
> Anyway, whatever the specific place is, I believe we must drop out of
> coroutine context _before_ calling bdrv_parent_drained_begin_single(),
> not only in callbacks called by it.
> 
>> Instead, bdrv_do_drained_begin with no recursion and poll
>> will accomplish the same thing (invoking bdrv_do_drained_begin_quiesce)
>> but will firstly check if we are already in a coroutine, and exit
>> from that via bdrv_co_yield_to_drain().
>>
>> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
> 
> Kevin
> 




reply via email to

[Prev in Thread] Current Thread [Next in Thread]