Re: [PATCH] block: bdrv_set_backing_hd(): use drained section
From: Emanuele Giuseppe Esposito
Subject: Re: [PATCH] block: bdrv_set_backing_hd(): use drained section
Date: Fri, 28 Jan 2022 15:12:11 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.2.0
On 27/01/2022 15:13, Kevin Wolf wrote:
> On 25.01.2022 at 11:12, Vladimir Sementsov-Ogievskiy wrote:
>> 25.01.2022 12:24, Paolo Bonzini wrote:
>>> On 1/24/22 18:37, Vladimir Sementsov-Ogievskiy wrote:
>>>> Graph modifications should be done in a drained section. The
>>>> stream_prepare() handler of the block-stream job calls
>>>> bdrv_set_backing_hd() without using a drained section, so it is
>>>> theoretically possible that some I/O request interleaves with the
>>>> graph modification and uses outdated pointers to removed block nodes.
>>>>
>>>> Some other callers of bdrv_set_backing_hd() don't care about drained
>>>> sections either, so it seems best to create the drained section
>>>> inside bdrv_set_backing_hd() itself.
>>>
>>> Emanuele has a similar patch in his series to protect all graph
>>> modifications with drains:
>>>
>>> @@ -3456,6 +3478,11 @@ int bdrv_set_backing_hd(BlockDriverState *bs,
>>> BlockDriverState *backing_hd,
>>>
>>> assert(qemu_in_main_thread());
>>>
>>> + bdrv_subtree_drained_begin_unlocked(bs);
>>> + if (backing_hd) {
>>> + bdrv_subtree_drained_begin_unlocked(backing_hd);
>>> + }
>>> +
>>> ret = bdrv_set_backing_noperm(bs, backing_hd, tran, errp);
>>> if (ret < 0) {
>>> goto out;
>>> @@ -3464,6 +3491,10 @@ int bdrv_set_backing_hd(BlockDriverState *bs,
>>> BlockDriverState *backing_hd,
>>> ret = bdrv_refresh_perms(bs, errp);
>>> out:
>>> tran_finalize(tran, ret);
>>> + if (backing_hd) {
>>> + bdrv_subtree_drained_end_unlocked(backing_hd);
>>> + }
>>> + bdrv_subtree_drained_end_unlocked(bs);
>>>
>>> return ret;
>>> }
>>>
>>> so the idea at least is correct.
>>>
>>> I don't object to fixing this independently, but please check
>>> 1) if a subtree drain would be more appropriate, 2) whether
>>> backing_hd should be drained as well, 3) whether we're guaranteed
>>> to be holding the AioContext lock as required for
>>> bdrv_drained_begin/end.
>>>
>>
>> Hmm.
>>
>> 1. Subtree draining of backing_hd will not help: bs is not drained,
>> so we may still have an in-flight request in bs touching the old
>> bs->backing.
What do you mean bs is not drained? In my patch I drain both.
>>
>> 2. I think a non-recursive drain of bs is enough. We modify only the
>> bs node, so that is what we should drain; backing_hd itself is not
>> modified. If backing_hd participates in some other backing chain, that
>> chain is not touched and its in-flight requests are not broken by the
>> modification, so why drain it? The same goes for the old bs->backing
>> and the other children of bs. We are not interested in in-flight
>> requests in the subtree that are not part of a request in bs. So, if
>> there are no in-flight requests in bs, we can modify bs without caring
>> about requests in the subtree.
>
> I agree on both points. Emanuele's patch seems to be doing unnecessary
> work there.
Wait, the point of my patches[*] is to protect
bdrv_replace_child_noperm(). See the cover letter of my series for more
info.

The reason for a subtree drain is that one callback of .attach() in
bdrv_replace_child_noperm() is bdrv_child_cb_attach(). This attaches the
node to the child->opaque->children list, so both nodes pointed to by
the BdrvChild are modified (child->bs and child->opaque). Simply
draining child->bs won't be enough to also cover child->opaque, in my
opinion[*]. The same applies to detach.
One interesting side note is what happens if we are moving the child
from one bs to another (old_bs and new_bs are both != NULL):
child->opaque will just lose and re-gain the same child.
Regarding this specific drain: I am surely missing something here,
because if I follow the code from
  bdrv_set_backing_hd(bs, backing_hd)
the call stack eventually ends up in
  bdrv_replace_child_noperm(child, new_bs /* -> backing_hd */)
and the graph modification there is:
  QLIST_INSERT_HEAD(&new_bs->parents, child, next_parent);
So why not protect backing_hd?
Full stack:

bdrv_set_backing_hd(bs, backing_hd)
  bdrv_set_backing_noperm(bs, backing_hd)
    bdrv_set_file_or_backing_noperm(bs, backing_hd)
      bdrv_attach_child_noperm(parent_bs = bs, child_bs = backing_hd)
        bdrv_attach_child_common(child_bs = backing_hd, .., parent_bs = bs)
          new_child.bs = NULL;
          new_child.opaque = parent_bs = bs;
          bdrv_replace_child_noperm(new_child, child_bs = backing_hd)
[*] = BTW, I see that you understand this stuff way deeper than I do, so
feel free to review my drained-related series if you have time :)
>
>> 3. Jobs are bound to an AioContext, so I believe they take care to
>> hold the AioContext lock. For example, job_prepare may be called
>> through job_exit(), which does aio_context_acquire(job->aio_context),
>> or through job_cancel(), which also seems to be called under
>> aio_context_acquire(). So it seems we generally do care about it, and
>> of course bdrv_set_backing_hd() must be called with the AioContext
>> lock held. If some code path doesn't do that, it's a bug.
>
> We do have some code that does exactly that: In the main thread, we
> often don't hold the AioContext lock, but only the BQL. I find it quite
> ugly, but it works as long as the node is in the main AioContext.
>
> One path where this is relevant is bdrv_open_inherit() ->
> bdrv_open_backing_file() -> bdrv_set_backing_hd(). This one is harmless
> because we know that we just created the new node in the main
> AioContext.
>
> All the other paths seem to come either from jobs (which take the
> AioContext as you explained) or directly from monitor commands, which I
> just checked to take the lock as well.
>
This won't hold any more once the job patches are applied, which is why
subtree_drained_begin/end_unlocked() works in my case.
Thank you,
Emanuele