qemu-stable
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH for-9.0] mirror: Don't call job_pause_point() under graph loc


From: Stefan Hajnoczi
Subject: Re: [PATCH for-9.0] mirror: Don't call job_pause_point() under graph lock
Date: Thu, 14 Mar 2024 10:29:39 -0400

On Wed, Mar 13, 2024 at 04:30:00PM +0100, Kevin Wolf wrote:
> Calling job_pause_point() while holding the graph reader lock
> potentially results in a deadlock: bdrv_graph_wrlock() first drains
> everything, including the mirror job, which pauses it. The job is only
> unpaused at the end of the drain section, which is when the graph writer
> lock has been successfully taken. However, if the job happens to be
> paused at a pause point where it still holds the reader lock, the writer
> lock can't be taken as long as the job is still paused.
> 
> Mark job_pause_point() as GRAPH_UNLOCKED and fix mirror accordingly.
> 
> Cc: qemu-stable@nongnu.org
> Buglink: https://issues.redhat.com/browse/RHEL-28125
> Fixes: 004915a96a7a40e942ac85e6d22518cbcd283506
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  include/qemu/job.h |  2 +-
>  block/mirror.c     | 10 ++++++----
>  2 files changed, 7 insertions(+), 5 deletions(-)
> 
> diff --git a/include/qemu/job.h b/include/qemu/job.h
> index 9ea98b5927..2b873f2576 100644
> --- a/include/qemu/job.h
> +++ b/include/qemu/job.h
> @@ -483,7 +483,7 @@ void job_enter(Job *job);
>   *
>   * Called with job_mutex *not* held.
>   */
> -void coroutine_fn job_pause_point(Job *job);
> +void coroutine_fn GRAPH_UNLOCKED job_pause_point(Job *job);
>  
>  /**
>   * @job: The job that calls the function.
> diff --git a/block/mirror.c b/block/mirror.c
> index 5145eb53e1..1bdce3b657 100644
> --- a/block/mirror.c
> +++ b/block/mirror.c
> @@ -479,9 +479,9 @@ static unsigned mirror_perform(MirrorBlockJob *s, int64_t 
> offset,
>      return bytes_handled;
>  }
>  
> -static void coroutine_fn GRAPH_RDLOCK mirror_iteration(MirrorBlockJob *s)
> +static void coroutine_fn GRAPH_UNLOCKED mirror_iteration(MirrorBlockJob *s)
>  {
> -    BlockDriverState *source = s->mirror_top_bs->backing->bs;
> +    BlockDriverState *source;
>      MirrorOp *pseudo_op;
>      int64_t offset;
>      /* At least the first dirty chunk is mirrored in one iteration. */
> @@ -489,6 +489,10 @@ static void coroutine_fn GRAPH_RDLOCK 
> mirror_iteration(MirrorBlockJob *s)
>      bool write_zeroes_ok = 
> bdrv_can_write_zeroes_with_unmap(blk_bs(s->target));
>      int max_io_bytes = MAX(s->buf_size / MAX_IN_FLIGHT, MAX_IO_BYTES);
>  
> +    bdrv_graph_co_rdlock();
> +    source = s->mirror_top_bs->backing->bs;

Is bdrv_ref(source) needed here so that source cannot go away if someone
else write locks the graph and removes it? Or maybe something else
protects against that. Either way, please add a comment that explains
why this is safe.

> +    bdrv_graph_co_rdunlock();
> +
>      bdrv_dirty_bitmap_lock(s->dirty_bitmap);
>      offset = bdrv_dirty_iter_next(s->dbi);
>      if (offset < 0) {
> @@ -1066,9 +1070,7 @@ static int coroutine_fn mirror_run(Job *job, Error 
> **errp)
>                  mirror_wait_for_free_in_flight_slot(s);
>                  continue;
>              } else if (cnt != 0) {
> -                bdrv_graph_co_rdlock();
>                  mirror_iteration(s);
> -                bdrv_graph_co_rdunlock();
>              }
>          }
>  
> -- 
> 2.44.0
> 

Attachment: signature.asc
Description: PGP signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]