qemu-block
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] block: bdrv_set_backing_hd(): use drained section


From: Vladimir Sementsov-Ogievskiy
Subject: Re: [PATCH] block: bdrv_set_backing_hd(): use drained section
Date: Tue, 1 Feb 2022 14:00:12 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.2.0

28.01.2022 17:12, Emanuele Giuseppe Esposito wrote:


On 27/01/2022 15:13, Kevin Wolf wrote:
Am 25.01.2022 um 11:12 hat Vladimir Sementsov-Ogievskiy geschrieben:
25.01.2022 12:24, Paolo Bonzini wrote:
On 1/24/22 18:37, Vladimir Sementsov-Ogievskiy wrote:
Graph modifications should be done in drained section. stream_prepare()
handler of block stream job call bdrv_set_backing_hd() without using
drained section and it's theoretically possible that some IO request
will interleave with graph modification and will use outdated pointers
to removed block nodes.

Some other callers use bdrv_set_backing_hd() not caring about drained
sections too. So it seems good to make a drained section exactly in
bdrv_set_backing_hd().

Emanuele has a similar patch in his series to protect all graph
modifications with drains:

@@ -3456,6 +3478,11 @@ int bdrv_set_backing_hd(BlockDriverState *bs, 
BlockDriverState *backing_hd,

       assert(qemu_in_main_thread());

+    bdrv_subtree_drained_begin_unlocked(bs);
+    if (backing_hd) {
+        bdrv_subtree_drained_begin_unlocked(backing_hd);
+    }
+
       ret = bdrv_set_backing_noperm(bs, backing_hd, tran, errp);
       if (ret < 0) {
           goto out;
@@ -3464,6 +3491,10 @@ int bdrv_set_backing_hd(BlockDriverState *bs, 
BlockDriverState *backing_hd,
       ret = bdrv_refresh_perms(bs, errp);
   out:
       tran_finalize(tran, ret);
+    if (backing_hd) {
+        bdrv_subtree_drained_end_unlocked(backing_hd);
+    }
+    bdrv_subtree_drained_end_unlocked(bs);

       return ret;
   }

so the idea at least is correct.

I don't object to fixing this independently, but please check
1) if a subtree drain would be more appropriate, 2) whether
backing_hd should be drained as well, 3) whether we're guaranteed
to be holding the AioContext lock as required for
bdrv_drained_begin/end.


Hmm.

1. Subtree draining of backing_hd will not help, as bs is not drained,
we still may have in-fight request in bs, touching old bs->backing.

What do you mean bs is not drained? In my patch I drain both.


2. I think non-recursive drain of bs is enough. We modify only bs
node, so we should drain it. backing_hd itself is not modified. If
backing_hd participate in some other backing chain - it's not touched,
and in-flight requests in that other chain are not broken by
modification, so why to drain it? Same for old bs->backing and other
bs children. We are not interested in in-flight requests in subtree
which are not part of request in bs. So, if no inflight requests in
bs, we can modify bs and not care about requests in subtree.

I agree on both points. Emanuele's patch seems to be doing unnecessary
work there.

Wait, the point of my patches[*] is to protect
bdrv_replace_child_noperm(). See the cover letter of my series for more
info.

The reason for a subtree drain is that one callback of .attach() in
bdrv_replace_child_noperm() is bdrv_child_cb_attach().
This attaches the node in child->opaque->children list, so both nodes
pointed by the BdrvChild are modified (child->bs and child->opaque).
Simply draining on child->bs won't be enough to also get child->opaque
in my opinion[*].

So you mean that backing_hd is modified as its parent is changed, do I 
understand correctly?

Yes, it is modified in this point of view, but why this needs drain? If root bs 
is drained, we don't have in-flight requests touching this modified 
parent-child relationship.


Same applies with detach.
One interesting side note is what happens if we are moving the child
from one bs to another (old_bs and new_bs are both != NULL):
child->opaque will just lose and re-gain the same child.

Regarding this specific drain: I am missing something for sure here,
because if I try to follow the code I see that from
bdrv_set_backing_hd(bs, backing_hd)
the call stack eventually ends up to
bdrv_replace_child_noperm(child, new_bs /* -> backing_hd */)
and then the graph modification there is:
QLIST_INSERT_HEAD(&new_bs->parents, child, next_parent);

So why not protecting backing_hd?

Full stack:
bdrv_set_backing_hd(bs, backing_hd)
  bdrv_set_backing_noperm(bs, backing_hd)
   bdrv_set_file_or_backing_noperm(bs, backing_hd)
     bdrv_attach_child_noperm(parent_bs = bs, child_bs = backing_hd)
      bdrv_attach_child_common(child_bs = backing_hd, .., parent_bs = bs)
       new_child.bs = NULL;
       new_child.opaque = parent_bs = bs;
       bdrv_replace_child_noperm(new_child, child_bs = backing_hd)

[*] = BTW, I see that you understand this stuff way deeper than I do, so
feel free to review my drained-related series if you have time :)


3. Jobs are bound to aio context, so I believe that they care to hold
AioContext lock. For example, on path job_prepare may be called
through job_exit(), job_exit() does
aio_context_acquire(job->aio_context), or it may be called through
job_cancel(), which seems to be called under aio_context_acquire() as
well. So, seems in general we care about it, and of course
bdrv_set_backing_hd() must be called with AioContext lock held. If for
some code path it isn't, it's a bug..

We do have some code that does exactly that: In the main thread, we
often don't hold the AioContext lock, but only the BQL. I find it quite
ugly, but it works as long as the node is in the main AioContext.

One path where this is relevant is bdrv_open_inherit() ->
bdrv_open_backing_file() -> bdrv_set_backing_hd(). This one is harmless
because we know that we just created the new node in the main
AioContext.

All the other paths seem to come either from jobs (which take the
AioContext as you explained) or directly from monitor commands, which I
just checked to take the lock as well.


This won't hold anymore when the job patches are applied. So that is why
in my case subtree_drained_begin/end_unlocked() works.

Thank you,
Emanuele



--
Best regards,
Vladimir



reply via email to

[Prev in Thread] Current Thread [Next in Thread]