Re: [Qemu-devel] [PATCH 1/1] virtio-blk: fix race on guest notifiers
From: Paolo Bonzini
Subject: Re: [Qemu-devel] [PATCH 1/1] virtio-blk: fix race on guest notifiers
Date: Mon, 6 Mar 2017 15:55:19 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.7.0
On 03/03/2017 20:43, Halil Pasic wrote:
> Uh, this is complicated. I'm not out of questions, but I fear taking too
> much of your precious time. I will ask again nevertheless, but please
> just cut the conversation with -EBUSY if it gets too expensive.
It's the opposite! I need other people to look at it, understand it, and
poke holes in my reasoning.
> bs_->wakeup = true; \
> while ((cond)) { \
> aio_context_release(ctx_); \
> aio_poll(qemu_get_aio_context(), true); \
> aio_context_acquire(ctx_); \
> waited_ = true; \
> } \
> bs_->wakeup = false; \
> } \
> waited_; })
>
> In the interesting case (me running with 2 iothreads, one assigned to my
> block device) we are going to take the "else" branch, and will end up
> releasing the ctx belonging to the iothread and then acquiring it again,
> basically waiting until the requests are done.
>
> Since virtio_blk_rw_complete is acquiring and releasing the same ctx,
> this makes a lot of sense. If we did not release the ctx we would end
> up with a deadlock.
Related to this, note that the above release/wait/acquire logic is
itself new to 2.8. Before, QEMU would run aio_poll(other_aio_context)
directly in the main thread. This relied on recursive mutexes and
special callbacks to pass the lock between the I/O and main threads.
This worked but it hid some thread-unsafe idioms, so I removed this in
the commits ending at 65c1b5b ("iothread: release AioContext around
aio_poll", 2016-10-28). BDRV_POLL_WHILE provides an abstraction that
will not change once aio_context_acquire/release disappears in favor of
fine-grained locks.
> But then I do not understand what the point of acquiring
> the mutex in virtio_blk_rw_complete is.
When QEMU does I/O in the main thread, it can call
BlockBackend/BlockDriverState functions. Even though their completion
is processed in the I/O thread (via BDRV_POLL_WHILE), you still need a
lock to handle mutual exclusion between the two.
In the case of virtio_blk_rw_complete, the mutex is needed because, for
example, block_acct_done is not thread-safe yet.
> Is patch b9e413dd required for the correctness of this patch?
> What is the role of the aio_context_acquire/release introduced by
> b9e413dd in virtio_blk_rw_complete?
Patch b9e413dd should introduce no semantic change. The acquire/release
calls it adds are all balanced by acquire/release calls removed
elsewhere, for example in block/linux-aio.c.
Paolo