qemu-block
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-block] [Qemu-devel] segfault in parallel blockjobs (iotest 30)


From: Fam Zheng
Subject: Re: [Qemu-block] [Qemu-devel] segfault in parallel blockjobs (iotest 30)
Date: Thu, 9 Nov 2017 14:05:26 +0800
User-agent: Mutt/1.9.1 (2017-09-22)

On Wed, 11/08 18:50, Anton Nefedov wrote:
> 
> 
> On 8/11/2017 5:45 PM, Alberto Garcia wrote:
> > On Tue 07 Nov 2017 05:19:41 PM CET, Anton Nefedov wrote:
> > > BlockBackend gets deleted by another job's stream_complete(), deferred
> > > to the main loop, so the fact that the job is put to sleep by
> > > bdrv_drain_all_begin() doesn't really stop it from execution.
> > 
> > I was debugging this a bit, and the block_job_defer_to_main_loop() call
> > happens _after_ all jobs have been paused, so I think that when the BDS
> > is drained then stream_run() finishes the last iteration without
> > checking if it's paused.
> > 
> > Without your patch (i.e. with a smaller STREAM_BUFFER_SIZE) then I
> > assume that the function would have to continue looping and
> > block_job_sleep_ns() would make the job coroutine yield, effectively
> > pausing the job and preventing the crash.
> > 
> > I can fix the crash by adding block_job_pause_point(&s->common) at the
> > end of stream_run() (where the 'out' label is).
> > 
> > I'm thinking that perhaps we should add the pause point directly to
> > block_job_defer_to_main_loop(), to prevent any block job from running
> > the exit function when it's paused.
> > 
> 
> Is it possible that the exit function is already deferred when the jobs
> are being paused? (even though it's at least less likely to happen)
> 
> Then should we flush the bottom halves somehow in addition to putting
> the jobs to sleep? And also then it all probably has to happen before
> bdrv_reopen_queue()

Or we can stash away the BH temporarily during pause period:

---

diff --git a/blockjob.c b/blockjob.c
index 3a0c49137e..7058ff7ae1 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -127,6 +127,9 @@ void block_job_txn_add_job(BlockJobTxn *txn, BlockJob *job)
 static void block_job_pause(BlockJob *job)
 {
     job->pause_count++;
+    if (job->defer_to_main_loop_bh) {
+        qemu_bh_cancel(job->defer_to_main_loop_bh);
+    }
 }
 
 static void block_job_resume(BlockJob *job)
@@ -136,6 +139,9 @@ static void block_job_resume(BlockJob *job)
     if (job->pause_count) {
         return;
     }
+    if (job->defer_to_main_loop_bh) {
+        qemu_bh_schedule(job->defer_to_main_loop_bh);
+    }
     block_job_enter(job);
 }
 
@@ -900,6 +906,9 @@ static void block_job_defer_to_main_loop_bh(void *opaque)
     /* Prevent race with block_job_defer_to_main_loop() */
     aio_context_acquire(data->aio_context);
 
+    qemu_bh_delete(data->job->defer_to_main_loop_bh);
+    data->job->defer_to_main_loop_bh = NULL;
+
     /* Fetch BDS AioContext again, in case it has changed */
     aio_context = blk_get_aio_context(data->job->blk);
     if (aio_context != data->aio_context) {
@@ -927,7 +936,9 @@ void block_job_defer_to_main_loop(BlockJob *job,
     data->fn = fn;
     data->opaque = opaque;
     job->deferred_to_main_loop = true;
+    assert(!job->defer_to_main_loop_bh);
+    job->defer_to_main_loop_bh = aio_bh_new(qemu_get_aio_context(),
+                                 block_job_defer_to_main_loop_bh, data);
 
-    aio_bh_schedule_oneshot(qemu_get_aio_context(),
-                            block_job_defer_to_main_loop_bh, data);
+    qemu_bh_schedule(job->defer_to_main_loop_bh);
 }
diff --git a/include/block/blockjob.h b/include/block/blockjob.h
index 67c0968fa5..28d29b3be6 100644
--- a/include/block/blockjob.h
+++ b/include/block/blockjob.h
@@ -97,6 +97,11 @@ typedef struct BlockJob {
      */
     bool deferred_to_main_loop;
 
+    /**
+     * The main loop BH for "defer to main loop" operation.
+     */
+    QEMUBH *defer_to_main_loop_bh;
+
     /** Element of the list of block jobs */
     QLIST_ENTRY(BlockJob) job_list;
 



reply via email to

[Prev in Thread] Current Thread [Next in Thread]