Re: [Qemu-devel] [PATCH v7 22/24] block: Rewrite bdrv_close_all()
From: Fam Zheng
Subject: Re: [Qemu-devel] [PATCH v7 22/24] block: Rewrite bdrv_close_all()
Date: Wed, 18 Nov 2015 10:52:59 +0800
User-agent: Mutt/1.5.21 (2010-09-15)
On Mon, 11/16 17:15, Max Reitz wrote:
> >> @@ -1971,13 +1969,44 @@ static void bdrv_close(BlockDriverState *bs)
> >> void bdrv_close_all(void)
> >> {
> >> BlockDriverState *bs;
> >> + AioContext *aio_context;
> >> + int original_refcount = 0;
> >>
> >> - QTAILQ_FOREACH(bs, &bdrv_states, device_list) {
> >> - AioContext *aio_context = bdrv_get_aio_context(bs);
> >> + /* Drop references from requests still in flight, such as canceled block
> >> + * jobs whose AIO context has not been polled yet */
> >> + bdrv_drain_all();
> >>
> >> - aio_context_acquire(aio_context);
> >> - bdrv_close(bs);
> >> - aio_context_release(aio_context);
> >> + blockdev_close_all_bdrv_states();
> >> + blk_remove_all_bs();
> >> +
> >> + /* Cancel all block jobs */
> >> + while (!QTAILQ_EMPTY(&all_bdrv_states)) {
> >> + QTAILQ_FOREACH(bs, &all_bdrv_states, bs_list) {
> >> + aio_context = bdrv_get_aio_context(bs);
> >> +
> >> + aio_context_acquire(aio_context);
> >> + if (bs->job) {
> >> + /* So we can safely query the current refcount */
> >> + bdrv_ref(bs);
> >> + original_refcount = bs->refcnt;
> >> +
> >> + block_job_cancel_sync(bs->job);
> >> + aio_context_release(aio_context);
> >> + break;
> >> + }
> >> + aio_context_release(aio_context);
> >> + }
> >> +
> >> + /* All the remaining BlockDriverStates are referenced directly or
> >> + * indirectly from block jobs, so there needs to be at least one BDS
> >> + * directly used by a block job */
> >> + assert(bs);
> >> +
> >> + /* Wait for the block job to release its reference */
> >> + while (bs->refcnt >= original_refcount) {
> >> + aio_poll(aio_context, true);
> >> + }
> >> + bdrv_unref(bs);
> >
> > If at this point bs->refcnt is greater than 1, why don't we care where the
> > remaining references come from?
>
> We do care. A BDS will not be removed from all_bdrv_states until it is
> deleted (i.e. its refcount becomes 0). Therefore, this loop will
> continue until all BDSs have been deleted.
>
> So where might additional references come from? Since this loop only
> cares about direct or indirect references from block jobs, that's
> exactly it:
>
> (1) You might have multiple block jobs running on a BDS in the future.
> Then, you'll cancel them one by one, and after having canceled the
> first one, the refcount will still be greater than one before the
> bdrv_unref().
>
> (2) Imagine a BDS A with a parent BDS B. There are block jobs running on
> both of them. Now, B is referenced by both its block job and by A
> (indirectly by the block job referencing A). If we cancel the job on
> B before the one on A, then the refcount on B will still be greater
> than 1 before bdrv_unref() because it is still referenced by its
> parent (until the block job on A is canceled, too).
>
> The first cannot happen right now, but the second one may, I'm not sure
> (depending on whether op blockers allow it).
OK, that makes sense. all_bdrv_states is the central place to make sure all
refcnts reach 0.
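Just to restate scenario (2) for myself with a toy model (hypothetical types
and names, not QEMU's actual API): canceling B's job drops only the job's
reference, so B stays pinned by its parent A until A's job is canceled too.

```c
#include <assert.h>

/* Toy stand-in for a BDS: just a refcount plus who holds references.
 * (Hypothetical model for illustration, not QEMU's BlockDriverState.) */
typedef struct ToyBDS {
    int refcnt;
    int has_job;   /* a block job holds one reference on us */
} ToyBDS;

static void toy_ref(ToyBDS *bs)   { bs->refcnt++; }
static void toy_unref(ToyBDS *bs) { bs->refcnt--; }

/* Canceling a job drops exactly the job's own reference. */
static void toy_job_cancel(ToyBDS *bs)
{
    if (bs->has_job) {
        bs->has_job = 0;
        toy_unref(bs);
    }
}

/* Scenario (2): B is referenced both by its own job and by its parent A
 * (via the backing link). Cancel B's job first and return B's refcount. */
static int scenario2_refcnt_after_cancel_b(void)
{
    ToyBDS b = { .refcnt = 0, .has_job = 1 };

    toy_ref(&b);        /* reference from B's own block job */
    toy_ref(&b);        /* reference from parent A's backing link */

    toy_job_cancel(&b); /* cancel B's job first... */
    return b.refcnt;    /* ...B is still pinned by A: refcnt stays at 1 */
}
```

So the outer loop's bdrv_unref() alone isn't enough for B; only once A's job
is canceled (dropping A and with it the backing reference) does B reach 0 and
leave all_bdrv_states.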
>
>
> And we do make sure that there are no additional references besides:
> - directly from the monitor (monitor-owned BDSs)
> - from a BB
> - from block jobs
> - (+ everything transitively through the respective BDS tree)
>
> If there were additional references, the inner loop would at some point
> no longer find a BDS with a block job while all_bdrv_states is still not
> empty. That's what the "assert(bs)" after the inner loop is for.
>
> If you can imagine another way where a reference to a BDS may come from,
> that would be a bug in this patch and we have to make sure to respect
> that case, too.
I think the only reference I'm not sure about is in xen_disk: blk_unref is
called in the .free and .disconnect callbacks, but I have no idea whether they
are called before bdrv_close_all.
Otherwise this patch looks good to me.
Thanks,
Fam