Re: [Qemu-devel] [PATCH v2] block: Fix bdrv_drain in coroutine
From: Stefan Hajnoczi
Subject: Re: [Qemu-devel] [PATCH v2] block: Fix bdrv_drain in coroutine
Date: Tue, 5 Apr 2016 10:39:56 +0100
User-agent: Mutt/1.5.24 (2015-08-30)
On Tue, Apr 05, 2016 at 09:27:39AM +0800, Fam Zheng wrote:
> On Mon, 04/04 12:57, Stefan Hajnoczi wrote:
> > On Fri, Apr 01, 2016 at 09:57:38PM +0800, Fam Zheng wrote:
> > > +    BdrvCoDrainData data;
> > > +
> > > +    assert(qemu_in_coroutine());
> > > +    data = (BdrvCoDrainData) {
> > > +        .co = qemu_coroutine_self(),
> > > +        .bs = bs,
> > > +        .done = false,
> > > +        .bh = aio_bh_new(bdrv_get_aio_context(bs), bdrv_co_drain_bh_cb,
> > > +                         &data),
> > > +    };
> > > +    qemu_bh_schedule(data.bh);
> > > +
> > > +    qemu_coroutine_yield();
> > > +    /* If we are resumed from some other event (such as an aio completion
> > > +     * or a timer callback), it is a bug in the caller that should be
> > > +     * fixed. */
> > > +    assert(data.done);
> > > +}
> > > +
> > >  /*
> > >   * Wait for pending requests to complete on a single BlockDriverState
> > >   * subtree, and suspend block driver's internal I/O until next request
> > >   * arrives.
> > > @@ -269,6 +306,10 @@ void bdrv_drain(BlockDriverState *bs)
> > >      bool busy = true;
> > >  
> > >      bdrv_drain_recurse(bs);
> > > +    if (qemu_in_coroutine()) {
> > > +        bdrv_co_drain(bs);
> > > +        return;
> > > +    }
> > >      while (busy) {
> > >          /* Keep iterating */
> > >          bdrv_flush_io_queue(bs);
> > > --
> > > 2.7.4
> >
> > block/mirror.c should call bdrv_co_drain() explicitly and bdrv_drain()
> > should assert(!qemu_in_coroutine()).
> >
> > The reason why existing bdrv_read() and friends detect coroutine context
> > at runtime is because it is meant for legacy code that runs in both
> > coroutine and non-coroutine contexts.
>
> blk_unref/bdrv_unref are called in coroutine context (the .bdrv_create
> implementations), so that doesn't just work with the assertion. Should I
> clean up this "legacy" code first, i.e. move the bdrv_unref calls to BHs
> in the callers and assert(!qemu_in_coroutine()) there too? I avoided
> that because it complicates the code somewhat.
This is a messy problem.
In general I don't like introducing yields into non-coroutine_fn
functions because it can lead to bugs when the caller didn't expect a
yield point.
For example, I myself wouldn't have expected bdrv_unref() to be a yield
point. So maybe coroutine code I've written would be vulnerable to
interference (I won't call it a race condition) from another coroutine
across the bdrv_unref() call. This could mean that another coroutine
now sees intermediate state that would never be visible without the new
yield point.
I think attempting to invoke qemu_co_queue_run_restart() directly
instead of scheduling a BH and yielding does not improve the situation.
It's also a layering violation since qemu_co_queue_run_restart() is just
meant for the core coroutine code and isn't a public interface.
Anyway, let's treat bdrv_drain() as legacy code that may check
qemu_in_coroutine(), but please make bdrv_co_drain() public so
block/mirror.c can at least call it directly.
Stefan