[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH for 2.9 v3 09/10] block: Use bdrv_coroutine_ente
From: |
Kevin Wolf |
Subject: |
Re: [Qemu-devel] [PATCH for 2.9 v3 09/10] block: Use bdrv_coroutine_enter to start I/O coroutines |
Date: |
Tue, 11 Apr 2017 12:06:42 +0200 |
User-agent: |
Mutt/1.5.21 (2010-09-15) |
Am 10.04.2017 um 17:05 hat Fam Zheng geschrieben:
> BDRV_POLL_WHILE waits for the started I/O by releasing bs's ctx then polling
> the main context, which relies on the yielded the coroutine would continue on
> bs->ctx and notify qemu_aio_context with bdrv_wakeup(). Thus, using
> qemu_coroutine_enter to start I/O is wrong because if the coroutine is entered
> from main loop, co->ctx will be qemu_aio_context, as a result of the "release,
> poll, acquire" loop of BDRV_POLL_WHILE, race conditions happen when both
> main thread and the iothread access the same BDS:
>
> main loop iothread
> -----------------------------------------------------------------------
> blockdev_snapshot
> aio_context_acquire(bs->ctx)
> virtio_scsi_data_plane_handle_cmd
> bdrv_drained_begin(bs->ctx)
> bdrv_flush(bs)
> bdrv_co_flush(bs)
> aio_context_acquire(bs->ctx).enter
> ...
> qemu_coroutine_yield(co)
> BDRV_POLL_WHILE()
> aio_context_release(bs->ctx)
>
> aio_context_acquire(bs->ctx).return
> ...
> aio_co_wake(co)
> aio_poll(qemu_aio_context) ...
> co_schedule_bh_cb() ...
> qemu_coroutine_enter(co) ...
>
> /* (A) bdrv_co_flush(bs) /* (B) I/O on bs */
> continues... */
> aio_context_release(bs->ctx)
> aio_context_acquire(bs->ctx)
>
> Note that in above case, bdrv_drained_begin() doesn't do the "release,
> poll, acquire" in BDRV_POLL_WHILE, because bs->in_flight == 0.
>
> Fix this by using bdrv_coroutine_enter and enter coroutine in the right
> context.
>
> iotests 109 output is updated because the coroutine reenter flow during
> mirror job complete is different (now through co_queue_wakeup, instead
> of the unconditional qemu_coroutine_switch before), making the end job
> len different.
>
> Signed-off-by: Fam Zheng <address@hidden>
>
> fixup
> ---
> block/block-backend.c | 4 ++--
> block/io.c | 14 +++++++-------
These changes are okay.
The question is whether we need more of them. We do have a few more
callers of qemu_coroutine_create():
* blkverify, quorum: Here, we are already in a coroutine context and
therefore the caller has made sure that we're in the right AioContext.
The usual functions simply inherit it, which is correct.
* nbd-client: Has its own handling for getting the coroutine in to the
right AioContexts, I'll assume it's okay.
* sheepdog:
co_read_response() calls aio_co_wake() immediately after
qemu_coroutine_create(). Is this allowed? I don't think co->ctx is
even initialised at this time. Is this a bug introduced in Paolo's
9d456654? ('block: explicitly acquire aiocontext in callbacks that
need it') Anyway, called only in AioContext callbacks, so we're fine.
do_req() is called from the main loop context in a few instances. Not
sure if this is a problem, but maybe using bdrv_coroutine_enter()
there would be safer.
* 9p, migration: Outside the block layer, trivially ok
* nbd/server: nbd_co_client_start() runs in the caller's AioContext.
Should be fine, it doesn't directly issue any requests, but just
spawns other coroutines that are put in the right AioContext.
* qemu-img, tests: Nobody cares about AioContexts there
So I think we should have another look at Sheepdog, the rest seems to be
fine.
> tests/qemu-iotests/109.out | 10 +++++-----
This one is curious. Why do we copy more data depending on how the job
coroutine is reentered?
Kevin
- Re: [Qemu-devel] [PATCH for 2.9 v3 03/10] tests/block-job-txn: Don't start block job before adding to txn, (continued)
- [Qemu-devel] [PATCH for 2.9 v3 04/10] coroutine: Extract qemu_aio_coroutine_enter, Fam Zheng, 2017/04/10
- [Qemu-devel] [PATCH for 2.9 v3 05/10] async: Introduce aio_co_enter and aio_co_enter_if_inactive, Fam Zheng, 2017/04/10
- [Qemu-devel] [PATCH for 2.9 v3 06/10] block: Introduce bdrv_coroutine_enter and *_if_inactive, Fam Zheng, 2017/04/10
- [Qemu-devel] [PATCH for 2.9 v3 07/10] blockjob: Use bdrv_coroutine_enter to start coroutine, Fam Zheng, 2017/04/10
- [Qemu-devel] [PATCH for 2.9 v3 08/10] qemu-io-cmds: Use bdrv_coroutine_enter, Fam Zheng, 2017/04/10
- [Qemu-devel] [PATCH for 2.9 v3 09/10] block: Use bdrv_coroutine_enter to start I/O coroutines, Fam Zheng, 2017/04/10
- [Qemu-devel] [PATCH for 2.9 v3 10/10] block: Fix bdrv_co_flush early return, Fam Zheng, 2017/04/10
- Re: [Qemu-devel] [PATCH for 2.9 v3 00/10] block: Fixes regarding dataplane and management operations, Stefan Hajnoczi, 2017/04/11
- Re: [Qemu-devel] [PATCH for 2.9 v3 00/10] block: Fixes regarding dataplane and management operations, Kevin Wolf, 2017/04/11
- [Qemu-devel] [PATCH for 2.9 v3 11/10] block, async: Remove unused *_enter_if_inactive(), Kevin Wolf, 2017/04/11