Re: [Qemu-devel] [RFC PATCH resend 00/11] Make CoMutex/CoQueue/CoRwlock thread-safe

From: Stefan Hajnoczi
Subject: Re: [Qemu-devel] [RFC PATCH resend 00/11] Make CoMutex/CoQueue/CoRwlock thread-safe
Date: Tue, 26 Apr 2016 11:54:19 +0100
User-agent: Mutt/1.5.24 (2015-08-30)

On Fri, Apr 15, 2016 at 01:31:55PM +0200, Paolo Bonzini wrote:
> [this time including the mailing list]
>
> This is yet another tiny bit of the multiqueue work, this time affecting
> the synchronization infrastructure for coroutines.  Currently, coroutines
> synchronize between the main I/O thread and the dataplane iothread through
> the AioContext lock.  However, for multiqueue a single BDS will be used
> by multiple iothreads and hence multiple AioContexts.  This calls for
> a different approach to coroutine synchronization.  This series is my
> first attempt at it.  Besides multiqueue, I think throttling groups
> (which can already span multiple AioContexts) could also take advantage
> of the new CoMutexes.
>
> The series has two main parts:
> - it makes CoMutex bind coroutines to an AioContext.  It is of course
>   still possible to move coroutines around with explicit yield and
>   qemu_coroutine_enter, but a CoMutex will now re-enter the coroutine
>   in the AioContext where it was running before, rather than in the
>   AioContext of the previous owner.  To do this, a new function
>   aio_co_schedule is introduced to run a coroutine on a given iothread
>   (see the sketch after this list).  I think this could
>   across multiple AioContexts without them moving around(*).  Of course
>   currently a CoMutex is generally not used across multiple iothreads,
>   because you have to acquire/release the AioContext around CoMutex
>   critical sections.  However...
> - ... the second change is exactly to make CoMutex thread-safe and remove
>   the need for a "thread-based" mutex around it.  The new CoMutex is
>   exactly the same as a mutex implementation that you'd find in an
>   operating system.  iothreads are the moral equivalent of CPUs in
>   a kernel, while coroutines resemble kernel threads running without
>   preemption on a CPU.  Even if you have to take concurrency into account,
>   the lack of preemption while running coroutines or bottom halves
>   keeps the complexity at bay.  For example, it is easy to synchronize
>   between qemu_co_mutex_lock's yield and the qemu_coroutine_enter in
>   aio_co_schedule's bottom half.
>   Same as before, CoMutex puts coroutines to sleep with
>   qemu_coroutine_yield and wakes them up with the new aio_co_schedule.
>   I could have wrapped CoMutex's CoQueue with a "regular" thread mutex or
>   spinlock.  The resulting code would have looked a lot like RFifoLock
>   (with CoQueue replacing RFifoLock's condition variable).  Instead,
>   inspired by the parallel between coroutines and non-preemptive kernel
>   threads, I chose to adopt the same lock-free mutex algorithm as OSv.
>   The algorithm only needs two to four atomic ops for a lock-unlock pair
>   (two when uncontended).  To cover CoQueue, each CoQueue is made to
>   depend on a CoMutex, similar to condition variables.  Most CoQueues
>   already have a corresponding CoMutex so this is not a big deal;
>   converting the others is left for a future series.  I did this because
>   CoQueue looks a lot like OSv's waitqueues; so if necessary, we could
>   even take OSv's support for wait morphing (which avoids the thundering
>   herd problem) and add it to CoMutex and CoQueue.  This may be useful
>   when making tracked_requests thread-safe.
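>
> To make the two parts concrete, here is roughly how the new pieces are
> meant to be used (simplified pseudo-QEMU code, not the actual patch;
> init calls are omitted and the data_ready() predicate is made up):
>
>     /* Run a coroutine in a specific iothread's AioContext; the actual
>      * entry happens in a bottom half scheduled on that context.
>      */
>     void aio_co_schedule(AioContext *ctx, Coroutine *co);
>
>     /* A CoQueue now pairs with a CoMutex, condition-variable style. */
>     static CoMutex lock;
>     static CoQueue queue;
>
>     static void coroutine_fn consumer(void *opaque)
>     {
>         qemu_co_mutex_lock(&lock);
>         /* data_ready() stands in for whatever condition is waited on. */
>         while (!data_ready()) {
>             /* Drops the mutex, yields, reacquires it on wakeup. */
>             qemu_co_queue_wait(&queue, &lock);
>         }
>         /* ... touch shared state while still holding the CoMutex ... */
>         qemu_co_mutex_unlock(&lock);
>     }
>
> Whichever iothread does the wakeup, it goes through aio_co_schedule, so
> consumer() resumes in the AioContext it was already running in.
>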
> Kevin: this has nothing to do with my old plan from Brno, and it's
> possibly a lot closer to what you wanted.  Your idea was to automatically
> release the "thread mutex" when a coroutine yields; I think you should
> be fine with not having a thread mutex at all!
>
> This will need some changes in the formats because, for multiqueue,
> CoMutexes would need to be used like "real" thread mutexes.  Code like
> this:
>     ...
>     qemu_co_mutex_unlock()
>     ... /* still access shared data, but don't yield */
>     qemu_coroutine_yield()
> might have to be changed to this pattern instead:
>     ... /* access shared data, but don't yield */
>     qemu_co_mutex_unlock()
>     qemu_coroutine_yield()
> because "adding a second CPU" is already introducing concurrency that
> wasn't there before: once the mutex is dropped, a coroutine in another
> iothread can take it right away and access the shared data.  The
> "non-preemptive multitasking" reference only
> applies to things that access AioContext-local data.  This includes the
> synchronization primitives implemented in this series, the thread pool,
> the Linux AIO support, but not much else.  It still simplifies _those_
> though. :)
>
> Anyhow, we'll always have some BlockDrivers that do not support multiqueue,
> such as the network protocols.  Thus it would be possible to handle the
> formats one at a time.  raw-posix, raw and qcow2 would already form a
> pretty good set, and the first two do not use CoMutex at all.
>
> The patch has quite a lot of new code, but about half of it is testcases.
> The new test covers correctness and (when run with --verbose) also takes a
> stab at measuring performance; the result is that the performance of CoMutex
> is comparable to that of pthread mutexes.  The only snag is that you cannot
> make a direct comparison between CoMutex (fair) and pthread_mutex_t
> (unfair).  For this reason the testcase also measures performance of a
> quick-and-dirty implementation of a fair mutex, based on MCS locks +
> futexes.
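>
> For reference, that fair-mutex baseline follows the classic MCS scheme.
> A minimal spinning sketch (names made up here, plain C11 atomics instead
> of the futex-based waiting used by the testcase) would look like:
>
>     #include <stdatomic.h>
>     #include <stdbool.h>
>     #include <stddef.h>
>
>     typedef struct MCSNode {
>         _Atomic(struct MCSNode *) next;
>         atomic_bool locked;
>     } MCSNode;
>
>     typedef struct {
>         _Atomic(MCSNode *) tail;      /* last waiter, or NULL if free */
>     } MCSLock;
>
>     static void mcs_lock(MCSLock *lock, MCSNode *node)
>     {
>         atomic_store(&node->next, NULL);
>         atomic_store(&node->locked, true);
>         MCSNode *prev = atomic_exchange(&lock->tail, node);
>         if (prev) {
>             /* Queue was not empty: link behind prev and wait for the
>              * handoff.  The testcase would sleep on a futex here
>              * instead of spinning.
>              */
>             atomic_store(&prev->next, node);
>             while (atomic_load(&node->locked)) {
>                 /* spin */
>             }
>         }
>     }
>
>     static void mcs_unlock(MCSLock *lock, MCSNode *node)
>     {
>         MCSNode *next = atomic_load(&node->next);
>         if (!next) {
>             /* No visible successor: try to mark the lock as free. */
>             MCSNode *expected = node;
>             if (atomic_compare_exchange_strong(&lock->tail, &expected,
>                                                NULL)) {
>                 return;
>             }
>             /* A successor is enqueuing; wait for it to link itself. */
>             while (!(next = atomic_load(&node->next))) {
>                 /* spin */
>             }
>         }
>         /* FIFO handoff; this is what makes the lock fair. */
>         atomic_store(&next->locked, false);
>     }
>
> The FIFO handoff is what makes it comparable to the (fair) CoMutex.
>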
> There's a lot of meat in the above text, and I hope it will make the code
> clearer and compensate for the terse commit messages. :)  I'll probably
> write a single-threaded testcase too, just to provide some more unit
> test comparison of "before" and "after".
>
> I haven't even started a guest with these patches, let alone run
> qemu-iotests... generally the changes are well confined to unit-tested
> code; patch 2, for example, is completely untested.  There are a couple
> of other places that at least need more comments, but I wanted to throw
> the patch out for an early review of the general approach.
>
> Paolo Bonzini (11):
>   coroutine: use QSIMPLEQ instead of QTAILQ
>   throttle-groups: restart throttled requests from coroutine context
>   coroutine: delete qemu_co_enter_next
>   aio: introduce aio_co_schedule
>   coroutine-lock: reschedule coroutine on the AioContext it was running on
>   coroutine-lock: make CoMutex thread-safe
>   coroutine-lock: add limited spinning to CoMutex
>   test-aio-multithread: add performance comparison with thread-based mutexes
>   coroutine-lock: place CoMutex before CoQueue in header
>   coroutine-lock: add mutex argument to CoQueue APIs
>   coroutine-lock: make CoRwlock thread-safe and fair
>
>  async.c                      |  38 ++++
>  block/backup.c               |   2 +-
>  block/io.c                   |   2 +-
>  block/qcow2-cluster.c        |   4 +-
>  block/sheepdog.c             |   2 +-
>  block/throttle-groups.c      |  71 +++++--
>  hw/9pfs/9p.c                 |   2 +-
>  include/block/aio.h          |  14 ++
>  include/qemu/atomic.h        |   3 +
>  include/qemu/coroutine.h     | 106 ++++++----
>  include/qemu/coroutine_int.h |  14 +-
>  iothread.c                   |  16 ++
>  stubs/iothread.c             |  11 ++
>  tests/Makefile               |  11 +-
>  tests/iothread.c             | 107 ++++++++++
>  tests/iothread.h             |  25 +++
>  tests/test-aio-multithread.c | 452 +++++++++++++++++++++++++++++++++++++++
>  trace-events                 |   8 +-
>  util/qemu-coroutine-lock.c   | 257 ++++++++++++++++++++----
>  util/qemu-coroutine.c        |   2 +-
>  20 files changed, 1037 insertions(+), 110 deletions(-)
>  create mode 100644 tests/iothread.c
>  create mode 100644 tests/iothread.h
>  create mode 100644 tests/test-aio-multithread.c

Looks promising, modulo the questions and comments I posted on the
individual patches.

I wonder whether it would be cleaner to introduce new primitives instead
of modifying CoMutex/CoQueue.  That way it would be clear whether code
is written to be thread-safe or not.
