qemu-block
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH for-6.1? v2 5/7] job: Add job_cancel_requested()


From: Vladimir Sementsov-Ogievskiy
Subject: Re: [PATCH for-6.1? v2 5/7] job: Add job_cancel_requested()
Date: Tue, 3 Aug 2021 15:35:50 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.12.0

02.08.2021 13:23, Max Reitz wrote:
On 27.07.21 17:47, Vladimir Sementsov-Ogievskiy wrote:
27.07.2021 18:39, Max Reitz wrote:
On 27.07.21 15:04, Vladimir Sementsov-Ogievskiy wrote:
26.07.2021 17:46, Max Reitz wrote:
Most callers of job_is_cancelled() actually want to know whether the job
is on its way to immediate termination.  For example, we refuse to pause
jobs that are cancelled; but this only makes sense for jobs that are
really actually cancelled.

A mirror job that is cancelled during READY with force=false should
absolutely be allowed to pause.  This "cancellation" (which is actually
a kind of completion) may take an indefinite amount of time, and so
should behave like any job during normal operation.  For example, with
on-target-error=stop, the job should stop on write errors. (In
contrast, force-cancelled jobs should not get write errors, as they
should just terminate and not do further I/O.)

Therefore, redefine job_is_cancelled() to only return true for jobs that
are force-cancelled (which as of HEAD^ means any job that interprets the
cancellation request as a request for immediate termination), and add
job_cancel_request() as the general variant, which returns true for any

job_cancel_requested()

jobs which have been requested to be cancelled, whether it be
immediately or after an arbitrarily long completion phase.

Buglink: https://gitlab.com/qemu-project/qemu/-/issues/462
Signed-off-by: Max Reitz <mreitz@redhat.com>
---
  include/qemu/job.h |  8 +++++++-
  block/mirror.c     | 10 ++++------
  job.c              |  7 ++++++-
  3 files changed, 17 insertions(+), 8 deletions(-)

diff --git a/include/qemu/job.h b/include/qemu/job.h
index 8aa90f7395..032edf3c5f 100644
--- a/include/qemu/job.h
+++ b/include/qemu/job.h
@@ -436,9 +436,15 @@ const char *job_type_str(const Job *job);
  /** Returns true if the job should not be visible to the management layer. */
  bool job_is_internal(Job *job);
  -/** Returns whether the job is scheduled for cancellation. */
+/** Returns whether the job is being cancelled. */
  bool job_is_cancelled(Job *job);
  +/**
+ * Returns whether the job is scheduled for cancellation (at an
+ * indefinite point).
+ */
+bool job_cancel_requested(Job *job);
+
  /** Returns whether the job is in a completed state. */
  bool job_is_completed(Job *job);
  diff --git a/block/mirror.c b/block/mirror.c
index e93631a9f6..72e02fa34e 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -936,7 +936,7 @@ static int coroutine_fn mirror_run(Job *job, Error **errp)
          /* Transition to the READY state and wait for complete. */
          job_transition_to_ready(&s->common.job);
          s->actively_synced = true;
-        while (!job_is_cancelled(&s->common.job) && !s->should_complete) {
+        while (!job_cancel_requested(&s->common.job) && !s->should_complete) {
              job_yield(&s->common.job);
          }
          s->common.job.cancelled = false;
@@ -1043,7 +1043,7 @@ static int coroutine_fn mirror_run(Job *job, Error **errp)
              }
                should_complete = s->should_complete ||
-                job_is_cancelled(&s->common.job);
+ job_cancel_requested(&s->common.job);
              cnt = bdrv_get_dirty_count(s->dirty_bitmap);
          }
  @@ -1087,7 +1087,7 @@ static int coroutine_fn mirror_run(Job *job, Error 
**errp)
          trace_mirror_before_sleep(s, cnt, job_is_ready(&s->common.job),
                                    delay_ns);
          job_sleep_ns(&s->common.job, delay_ns);
-        if (job_is_cancelled(&s->common.job) && s->common.job.force_cancel) {
+        if (job_is_cancelled(&s->common.job)) {
              break;
          }
          s->last_pause_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
@@ -1099,9 +1099,7 @@ immediate_exit:
           * or it was cancelled prematurely so that we do not guarantee that
           * the target is a copy of the source.
           */
-        assert(ret < 0 ||
-               (s->common.job.force_cancel &&
-                job_is_cancelled(&s->common.job)));
+        assert(ret < 0 || job_is_cancelled(&s->common.job));

(As a note, I hope this does the job regarding your suggestions for patch 4. :))

          assert(need_drain);
          mirror_wait_for_all_io(s);
      }
diff --git a/job.c b/job.c
index e78d893a9c..dba17a680f 100644
--- a/job.c
+++ b/job.c
@@ -216,6 +216,11 @@ const char *job_type_str(const Job *job)
  }
    bool job_is_cancelled(Job *job)
+{
+    return job->cancelled && job->force_cancel;

can job->cancelled be false when job->force_cancel is true ? I think not and 
worth an assertion here. Something like

if (job->force_cancel) {
   assert(job->cancelled);
   return true;
}

return false;

Sounds good, why not.


+}
+
+bool job_cancel_requested(Job *job)
  {
      return job->cancelled;
  }
@@ -1015,7 +1020,7 @@ void job_complete(Job *job, Error **errp)
      if (job_apply_verb(job, JOB_VERB_COMPLETE, errp)) {
          return;
      }
-    if (job_is_cancelled(job) || !job->driver->complete) {
+    if (job_cancel_requested(job) || !job->driver->complete) {
          error_setg(errp, "The active block job '%s' cannot be completed",
                     job->id);
          return;


I think it's a correct change, although there may be unexpected side-effects, 
it's hard to imagine all consequences of changing job_is_cancelled() semantics 
called in several places in job.c.

For example: so we now don't set -ECANCELED in job_update_rc for soft-cancel..

This mean that job_finalize_single() will call job_commit instead of job_abort, 
and job_commit may do some graph changes, which shouldn't happen for soft-cancel

So the question is when these two conditions come into play.

There are two places that set job->ret to ECANCELED if the job is cancelled, 
namely job_update_rc(), and job_finish_sync().

job_finish_sync() will do so only after the job has been completed, which requires 
the job to either have been aborted (i.e. ret is non-zero anyway) or 
job_completed() to have been called. job_completed() is called by job_exit(), 
which is run after the job’s main loop has exited.  If mirror is soft-cancelled, 
mirror_run() will clear s->common.job.cancelled before returning, so 
job_finish_sync() will not see the job as cancelled.

job_update_rc() is called from three places:

job_finalize_single(): Asserts that job_is_completed(), so the same reasoning 
as for job_finish_sync() applies.

job_prepare(): Called by job_do_finalize(), which can only happen when the job 
is completed.  (JobVerbTable only allows finalization when the job is PENDING, 
which is a state where job_is_completed() is true, i.e. after mirror_run().)

job_completed(): Same reasoning as for job_finish_sync().


So it looks to me like these places that set job->ret to ECANCELED if the job 
has been cancelled do not consider a soft-cancelled mirror job to have been 
cancelled, which makes using job_is_cancelled() instead of job_cancel_requested() 
correct there. (And most likely, we can drop the `.cancelled = false` statements 
from the mirror job in turn.)



Hm, reasonable. OK than, thanks for explanation.


--
Best regards,
Vladimir



reply via email to

[Prev in Thread] Current Thread [Next in Thread]