qemu-block
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [RFC] block-backend: prevent dangling BDS pointer in blk_drain()


From: Stefan Hajnoczi
Subject: Re: [RFC] block-backend: prevent dangling BDS pointer in blk_drain()
Date: Mon, 13 Dec 2021 10:06:40 +0000

On Fri, Dec 10, 2021 at 03:00:38PM +0100, Kevin Wolf wrote:
> Am 09.12.2021 um 15:23 hat Stefan Hajnoczi geschrieben:
> > The BlockBackend root child can change during bdrv_drained_begin() when
> > aio_poll() is invoked. In fact the BlockDriverState can reach refcnt 0
> > and blk_drain() is left with a dangling BDS pointer.
> > 
> > One example is scsi_device_purge_requests(), which calls blk_drain() to
> > wait for in-flight requests to cancel. If the backup blockjob is active,
> > then the BlockBackend root child is a temporary filter BDS owned by the
> > blockjob. The blockjob can complete during bdrv_drained_begin() and the
> > last reference to the BDS is released when the temporary filter node is
> > removed. This results in a use-after-free when blk_drain() calls
> > bdrv_drained_end(bs) on the dangling pointer.
> > 
> > The general problem is that a function and its callers must not assume
> > that bs is still valid across aio_poll(). Explicitly hold a reference to
> > bs in blk_drain() to avoid the dangling pointer.
> > 
> > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> > ---
> > I found that BDS nodes are sometimes deleted with bs->quiesce_counter >
> > 0 (at least when running "make check"), so it is currently not possible
> > to put the bdrv_ref/unref() calls in bdrv_do_drained_begin() and
> > bdrv_do_drained_end() because they will be unbalanced. That would have
> > been a more general solution than only fixing blk_drain().
> 
> They are not supposed to end up unbalanced because detaching a child
> calls bdrv_unapply_subtree_drain(). In fact, I think test-bdrv-drain
> tests a few scenarios like this.
> 
> Do have more details about the case that failed for you?

You're right. I put the assert(bs->quiesce_counter == 0) statement to
early in bdrv_delete(). Moving it after bdrv_close() shows that the
counter reaches 0 thanks to the bdrv_drain_all_end_quiesce() call in
bdrv_close().

Stefan

Attachment: signature.asc
Description: PGP signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]