From: Emanuele Giuseppe Esposito
Subject: [PATCH v2 09/10] child_job_drained_poll: override polling condition only when in home thread
Date: Mon, 14 Mar 2022 09:18:53 -0400

drv->drained_poll() is only implemented in mirror, and allows
it to drain from within the coroutine. The mirror implementation uses
the in_drain flag to recognize when it is draining from the coroutine,
and consequently avoids deadlocking (otherwise the poll condition in
child_job_drained_poll() would wait for the job itself).

The problem is that this flag is dangerous, because it breaks the
bdrv_drained_begin() invariant: once drained_begin returns, all
jobs, in-flight requests, and anything running in the iothread
must be blocked.

This can be broken in the following way:
iothread(mirror): s->in_drain = true; // mirror.c:1112
main loop: bdrv_drained_begin(mirror_bs);
/*
 * drained_begin waits for the bdrv_drain_poll_top_level() condition,
 * which translates to child_job_drained_poll() for jobs, but
 * mirror implements drv->drained_poll() so it returns
 * !!in_flight_requests, which is 0 (assertion in mirror.c:1105).
 */
main loop: thinks iothread is stopped and is modifying the graph...
iothread(mirror): *continues*, as nothing is stopping it
iothread(mirror): bdrv_drained_begin(bs);
/* draining reads the graph while it is modified!! */
main loop: done modifying the graph...

In order to fix this, we can simply allow drv->drained_poll()
to be called only by the iothread, and not by the main loop.
We distinguish the two by using in_aio_context_home_thread(), which
returns false if the calling thread is not the home thread of @ctx.
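
As a rough illustration only (not part of this patch), the resulting
decision can be modelled in standalone C. ToyJob,
toy_mirror_drained_poll() and the caller_in_home_thread parameter are
hypothetical stand-ins for Job, mirror's drained_poll() and the result
of in_aio_context_home_thread(ctx); only the branch structure is meant
to match the change below.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for QEMU's Job/BlockJob state. */
typedef struct ToyJob {
    bool busy;       /* job coroutine is running driver code */
    bool completed;  /* job has finished */
    int in_flight;   /* requests still in flight */
    /* Optional driver override, modelled on drv->drained_poll(). */
    bool (*drained_poll)(struct ToyJob *job);
} ToyJob;

/* Modelled on mirror: report activity only while requests are in flight. */
static bool toy_mirror_drained_poll(ToyJob *job)
{
    return job->in_flight > 0;
}

/*
 * Returns true while the drain must keep polling. The driver override
 * is consulted only when the caller runs in the job's home thread;
 * any other caller (for example the main loop) keeps waiting until
 * the job is no longer busy.
 */
static bool toy_child_job_drained_poll(ToyJob *job, bool caller_in_home_thread)
{
    assert(job != NULL);

    if (!job->busy || job->completed) {
        return false;
    }

    if (caller_in_home_thread && job->drained_poll) {
        return job->drained_poll(job);
    }
    return true;
}
```

With a busy job that has no in-flight requests, a caller outside the
home thread keeps polling, while a caller inside the home thread is
allowed to stop, which is the case the in_drain flag was meant to cover.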

Co-Developed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
 blockjob.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/blockjob.c b/blockjob.c
index 4868453d74..14a919b3cc 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -110,7 +110,9 @@ static bool child_job_drained_poll(BdrvChild *c)
     BlockJob *bjob = c->opaque;
     Job *job = &bjob->job;
     const BlockJobDriver *drv = block_job_driver(bjob);
+    AioContext *ctx;
 
+    ctx = job->aio_context;
     /* An inactive or completed job doesn't have any pending requests. Jobs
      * with !job->busy are either already paused or have a pause point after
      * being reentered, so no job driver code will run before they pause. */
@@ -118,9 +120,14 @@ static bool child_job_drained_poll(BdrvChild *c)
         return false;
     }
 
-    /* Otherwise, assume that it isn't fully stopped yet, but allow the job to
-     * override this assumption. */
-    if (drv->drained_poll) {
+    /*
+     * Otherwise, assume that it isn't fully stopped yet, but allow the job to
+     * override this assumption, if the drain is being performed in the
+     * iothread. We need to check that the caller is the home thread because
+     * it could otherwise lead the main loop to exit polling while the job
+     * has not paused yet.
+     */
+    if (in_aio_context_home_thread(ctx) && drv->drained_poll) {
         return drv->drained_poll(bjob);
     } else {
         return true;
-- 
2.31.1
