[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [RFC v4 12/21] blockjobs: ensure abort is called for ca
From: |
Kevin Wolf |
Subject: |
Re: [Qemu-devel] [RFC v4 12/21] blockjobs: ensure abort is called for cancelled jobs |
Date: |
Wed, 28 Feb 2018 17:05:18 +0100 |
User-agent: |
Mutt/1.9.1 (2017-09-22) |
Am 24.02.2018 um 00:51 hat John Snow geschrieben:
> Presently, even if a job is canceled post-completion as a result of
> a failing peer in a transaction, it will still call .commit because
> nothing has updated or changed its return code.
>
> The reason why this does not cause problems currently is because
> backup's implementation of .commit checks for cancellation itself.
>
> I'd like to simplify this contract:
>
> (1) Abort is called if the job/transaction fails
> (2) Commit is called if the job/transaction succeeds
>
> To this end: A job's return code, if 0, will be forcibly set as
> -ECANCELED if that job has already concluded. Remove the now
> redundant check in the backup job implementation.
Thanks, another fix that addresses a comment I made on an earlier patch!
> We need to check for cancellation in both block_job_completed
> AND block_job_completed_single, because jobs may be cancelled between
> those two calls; for instance in transactions.
>
> The check in block_job_completed could be removed, but there's no
> point in starting to attempt to succeed a transaction that we know
> in advance will fail.
>
> This does NOT affect mirror jobs that are "canceled" during their
> synchronous phase. The mirror job itself forcibly sets the canceled
> property to false prior to ceding control, so such cases will invoke
> the "commit" callback.
Which is a sign that mirror is abusing the interface. But yes, we need
to keep it working.
> Signed-off-by: John Snow <address@hidden>
> ---
> block/backup.c | 2 +-
> block/trace-events | 1 +
> blockjob.c | 19 +++++++++++++++----
> 3 files changed, 17 insertions(+), 5 deletions(-)
>
> diff --git a/block/backup.c b/block/backup.c
> index 7e254dabff..453cd62c24 100644
> --- a/block/backup.c
> +++ b/block/backup.c
> @@ -206,7 +206,7 @@ static void backup_cleanup_sync_bitmap(BackupBlockJob
> *job, int ret)
> BdrvDirtyBitmap *bm;
> BlockDriverState *bs = blk_bs(job->common.blk);
>
> - if (ret < 0 || block_job_is_cancelled(&job->common)) {
> + if (ret < 0) {
> /* Merge the successor back into the parent, delete nothing. */
> bm = bdrv_reclaim_dirty_bitmap(bs, job->sync_bitmap, NULL);
> assert(bm);
> diff --git a/block/trace-events b/block/trace-events
> index 266afd9e99..5e531e0310 100644
> --- a/block/trace-events
> +++ b/block/trace-events
> @@ -5,6 +5,7 @@ bdrv_open_common(void *bs, const char *filename, int flags,
> const char *format_n
> bdrv_lock_medium(void *bs, bool locked) "bs %p locked %d"
>
> # blockjob.c
> +block_job_completed(void *job, int ret, int jret) "job %p ret %d corrected
> ret %d"
> block_job_state_transition(void *job, int ret, const char *legal, const
> char *s0, const char *s1) "job %p (ret: %d) attempting %s transition
> (%s-->%s)"
> block_job_apply_verb(void *job, const char *state, const char *verb, const
> char *legal) "job %p in state %s; applying verb %s (%s)"
>
> diff --git a/blockjob.c b/blockjob.c
> index 4d29391673..ef17dea004 100644
> --- a/blockjob.c
> +++ b/blockjob.c
> @@ -384,13 +384,22 @@ void block_job_start(BlockJob *job)
> bdrv_coroutine_enter(blk_bs(job->blk), job->co);
> }
>
> +static void block_job_update_rc(BlockJob *job)
> +{
> + if (!job->ret && block_job_is_cancelled(job)) {
> + job->ret = -ECANCELED;
> + }
> + if (job->ret) {
> + block_job_state_transition(job, BLOCK_JOB_STATUS_ABORTING);
> + }
> +}
> +
> static void block_job_completed_single(BlockJob *job)
> {
> assert(job->completed);
>
> - if (job->ret || block_job_is_cancelled(job)) {
> - block_job_state_transition(job, BLOCK_JOB_STATUS_ABORTING);
> - }
> + /* Ensure abort is called for late-transactional failures */
> + block_job_update_rc(job);
>
> if (!job->ret) {
> if (job->driver->commit) {
> @@ -898,7 +907,9 @@ void block_job_completed(BlockJob *job, int ret)
> assert(blk_bs(job->blk)->job == job);
> job->completed = true;
> job->ret = ret;
> - if (ret < 0 || block_job_is_cancelled(job)) {
> + block_job_update_rc(job);
Hmm... We are transitioning to ABORTED here now. This means that in
block_job_completed_single() we get an ABORTED -> ABORTED transition.
A bit ugly, but not really a problem. If you decide to remove the
s0 == s1 special case in block_job_state_transition(), you'll need to
allow this transition in the table.
Reviewed-by: Kevin Wolf <address@hidden>