From: Stefan Hajnoczi
Subject: [Qemu-devel] Block I/O outside the QEMU global mutex was "Re: [RFC PATCH 00/17] Support for multiple "AIO contexts""
Date: Tue, 9 Oct 2012 11:08:11 +0200
User-agent: Mutt/1.5.21 (2010-09-15)

On Mon, Oct 08, 2012 at 03:00:04PM +0200, Paolo Bonzini wrote:
> On 08/10/2012 13:39, Stefan Hajnoczi wrote:
> > 2. Thread pool for dispatching I/O requests outside the QEMU global mutex.
> 
> I looked at this in the past and it feels like a dead end to me.  I had
> a lot of special code in the thread-pool to mimic yield/enter of
> threadpool work-items.  It was needed mostly for I/O throttling, but
> also because it feels unsafe to swap a CoMutex with a Mutex---the
> waiting I/O operations can starve the threadpool.
> 
> I now think it is simpler to keep a cooperative coroutine-based
> multitasking in the general case.  At the same time you can ensure that
> AIO backends (both Linux AIO and posix-aio-compat) get a suitable
> no-coroutine fast path in the common case of no copy-on-read, no
> throttling, etc. -- which can be done in the current code too.

You're right.  Initially we can keep coroutines and add aio fastpaths.  There's
no need for invasive block layer changes to convert coroutines to threads yet.
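
As a quick illustration of what such a fastpath could look like (every name
below is made up for the sketch; this is not the existing block layer API),
the decision is simply to bypass the coroutine machinery whenever no feature
actually needs it:

/* Hypothetical sketch, not the QEMU API: all names here are invented to
 * illustrate the dispatch decision only. */
#include <stdbool.h>
#include <stdio.h>

typedef struct BlockState {
    bool copy_on_read;     /* features that still need the coroutine path */
    bool io_throttling;
    bool has_aio_backend;  /* e.g. Linux AIO or posix-aio-compat */
} BlockState;

/* Stand-in for submitting a request straight to the AIO backend. */
static void submit_native_aio(BlockState *bs)
{
    (void)bs;
    printf("fastpath: request goes straight to the AIO backend\n");
}

/* Stand-in for the existing coroutine-based request path. */
static void submit_via_coroutine(BlockState *bs)
{
    (void)bs;
    printf("slowpath: request is handled by a block layer coroutine\n");
}

/* The fastpath rule: fall back to coroutines only when a feature needs them. */
static void submit_request(BlockState *bs)
{
    if (bs->has_aio_backend && !bs->copy_on_read && !bs->io_throttling) {
        submit_native_aio(bs);
    } else {
        submit_via_coroutine(bs);
    }
}

int main(void)
{
    BlockState raw = { .has_aio_backend = true };

    submit_request(&raw);        /* takes the fastpath */
    raw.io_throttling = true;
    submit_request(&raw);        /* throttling forces the coroutine path */
    return 0;
}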

> Another important step would be to add bdrv_drain.  Kevin pointed out to
> me that only ->file and ->backing_hd need to be drained.  Well, there
> may be other BlockDriverStates for vmdk extents or similar cases
> (Benoit's quorum device for example)... these need to be handled the
> same way for bdrv_flush, bdrv_reopen, bdrv_drain so perhaps it is useful
> to add a common way to get them.
> 
> And you need a lock to the AioContext, too.  Then the block device can
> use the AioContext lock in order to synchronize multiple threads working
> on the block device.  The lock will effectively block the ioeventfd
> thread, so that bdrv_lock+bdrv_drain+...+bdrv_unlock is a replacement
> for the current usage of bdrv_drain_all within the QEMU lock.
> 
> > I'm starting to work on these steps and will send RFCs. This series
> > looks good to me.
> 
> Thanks!  A lot of the next steps can be done in parallel and more
> importantly none of them blocks each other (roughly)... so I'm eager to
> look at your stuff! :)
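
On the point above about ->file, ->backing_hd and the extra
BlockDriverStates (vmdk extents, Benoit's quorum children): one option is a
common child iterator that bdrv_flush, bdrv_reopen and bdrv_drain all walk.
A rough sketch with made-up names (not the current block layer structures):

/* Hypothetical sketch only -- these are not the real block layer types. */
#include <stdio.h>

#define MAX_CHILDREN 16

typedef struct BDS {
    const char *name;
    struct BDS *children[MAX_CHILDREN];  /* file, backing_hd, extents, ... */
    int nb_children;
} BDS;

/* One common way to reach every child, so drain/flush/reopen agree. */
static void bds_foreach_child(BDS *bs, void (*fn)(BDS *child))
{
    for (int i = 0; i < bs->nb_children; i++) {
        fn(bs->children[i]);
    }
}

static void drain_one(BDS *bs)
{
    bds_foreach_child(bs, drain_one);   /* children first */
    printf("draining %s\n", bs->name);
}

int main(void)
{
    BDS backing = { .name = "backing_hd" };
    BDS file    = { .name = "file" };
    BDS top     = { .name = "top", .children = { &file, &backing },
                    .nb_children = 2 };

    drain_one(&top);   /* drains file, backing_hd, then top */
    return 0;
}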

Some notes on moving virtio-blk processing out of the QEMU global mutex:

1. Dedicated thread for non-QEMU mutex virtio ioeventfd processing.
   The point of this thread is to process without the QEMU global mutex, using
   only fine-grained locks.  (In the future this thread can be integrated back
   into the QEMU iothread when the global mutex has been eliminated.)

   The dedicated thread must hold a reference to the virtio-blk device so
   it will not be destroyed.  Hot unplug requires asking the ioeventfd
   processing threads to release their reference (see the sketch after
   these notes).

2. Versions of virtqueue_pop() and virtqueue_push() that execute outside
   global QEMU mutex.  Look at memory API and threaded device dispatch.

   The virtio device itself must have a lock so its vring-related state
   can be modified safely.
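
To make the two notes above concrete, here is a hedged sketch using plain
pthreads and made-up names (this is not QEMU's object model or the real
virtio code): the dedicated thread owns a reference to the device for its
whole lifetime, and every virtqueue_pop()/virtqueue_push()-style access to
vring state takes only a per-device lock:

/* Hypothetical sketch only.  It illustrates two rules from the notes above:
 * (1) the processing thread holds a reference to the device so hot unplug
 *     cannot free it underneath the thread, and
 * (2) vring-related state is protected by a per-device lock rather than
 *     the QEMU global mutex. */
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct VirtioBlkDev {
    atomic_int refcount;
    atomic_bool stop_requested;    /* set by hot unplug */
    pthread_mutex_t vring_lock;    /* protects the fields below */
    unsigned avail_idx;            /* stand-ins for vring bookkeeping */
    unsigned used_idx;
} VirtioBlkDev;

static void dev_ref(VirtioBlkDev *dev)
{
    atomic_fetch_add(&dev->refcount, 1);
}

static void dev_unref(VirtioBlkDev *dev)
{
    if (atomic_fetch_sub(&dev->refcount, 1) == 1) {
        free(dev);                 /* last user frees the device */
    }
}

/* virtqueue_pop()-like helper: takes only the per-device lock. */
static bool vq_pop(VirtioBlkDev *dev, unsigned *req)
{
    bool ok;

    pthread_mutex_lock(&dev->vring_lock);
    ok = dev->avail_idx < 8;       /* pretend 8 requests were queued */
    if (ok) {
        *req = dev->avail_idx++;
    }
    pthread_mutex_unlock(&dev->vring_lock);
    return ok;
}

/* virtqueue_push()-like helper: same rule, per-device lock only. */
static void vq_push(VirtioBlkDev *dev, unsigned req)
{
    pthread_mutex_lock(&dev->vring_lock);
    dev->used_idx++;
    pthread_mutex_unlock(&dev->vring_lock);
    printf("completed request %u\n", req);
}

/* The dedicated ioeventfd-processing thread: no QEMU global mutex. */
static void *ioeventfd_thread(void *opaque)
{
    VirtioBlkDev *dev = opaque;    /* reference was taken by the creator */
    unsigned req;

    while (!atomic_load(&dev->stop_requested)) {
        /* ... real code would block on the ioeventfd here ... */
        while (vq_pop(dev, &req)) {
            vq_push(dev, req);
        }
        atomic_store(&dev->stop_requested, true);  /* demo: one pass only */
    }

    dev_unref(dev);                /* release the thread's reference */
    return NULL;
}

int main(void)
{
    VirtioBlkDev *dev = calloc(1, sizeof(*dev));
    pthread_t tid;

    pthread_mutex_init(&dev->vring_lock, NULL);
    atomic_store(&dev->refcount, 1);   /* creator's reference */
    dev_ref(dev);                      /* reference owned by the thread */
    pthread_create(&tid, NULL, ioeventfd_thread, dev);

    /* In the demo the thread stops itself after one pass; hot unplug would
     * set stop_requested here instead, then drop the creator's reference. */
    pthread_join(tid, NULL);
    dev_unref(dev);
    return 0;
}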

Here are the steps that have been mentioned:

1. aio fastpath - for raw-posix and other aio block drivers, can we reduce I/O
   request latency by skipping block layer coroutines?  This can be
   prototyped (hacked) easily to scope out how much benefit we get.  It's
   completely independent of the global mutex-related work.

2. BlockDriverState <-> AioContext attach.  Allows I/O requests to be processed
   by event loops other than the QEMU iothread.

3. bdrv_drain() and BlockDriverState synchronization.  Make it safe to use
   BlockDriverState outside the QEMU mutex and ensure that bdrv_drain() works
   (a sketch of the lock/drain/unlock pattern follows this list).

4. Unlocked event loop thread.  This is similar to QEMU's iothread except it
   doesn't take the big lock.  In theory we could have several of these threads
   processing at the same time.  virtio-blk ioeventfd processing will be done
   in this thread.

5. virtqueue_pop()/virtqueue_push() without QEMU global mutex.  Before this is
   implemented we could temporarily acquire/release the QEMU global mutex.
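
For step 3, here is a rough sketch of the bdrv_lock+bdrv_drain+...+bdrv_unlock
pattern Paolo describes above (hypothetical types and locking details; the
real interface may end up different): an operation that today relies on
bdrv_drain_all() under the global mutex instead takes the AioContext lock,
drains only that device, does its work, and unlocks:

/* Hypothetical sketch of bdrv_lock + bdrv_drain + ... + bdrv_unlock. */
#include <pthread.h>
#include <stdio.h>

typedef struct AioCtx {
    pthread_mutex_t lock;   /* taking this blocks the ioeventfd thread */
} AioCtx;

typedef struct BDS {
    AioCtx *ctx;
    int in_flight;          /* pending requests on this device only */
} BDS;

static void bdrv_lock(BDS *bs)   { pthread_mutex_lock(&bs->ctx->lock); }
static void bdrv_unlock(BDS *bs) { pthread_mutex_unlock(&bs->ctx->lock); }

/* Wait for this device's requests only, not every device in the VM. */
static void bdrv_drain(BDS *bs)
{
    while (bs->in_flight > 0) {
        bs->in_flight--;    /* stand-in for waiting on a real completion */
    }
}

/* e.g. a reopen or snapshot operation */
static void do_synchronous_op(BDS *bs)
{
    bdrv_lock(bs);          /* new requests can no longer be submitted */
    bdrv_drain(bs);         /* wait for what is already in flight */
    printf("device quiesced, safe to modify\n");
    bdrv_unlock(bs);        /* request processing resumes */
}

int main(void)
{
    AioCtx ctx;
    BDS bs = { .ctx = &ctx, .in_flight = 3 };

    pthread_mutex_init(&ctx.lock, NULL);
    do_synchronous_op(&bs);
    return 0;
}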

Let me run benchmarks on the aio fastpath.  I'm curious how much difference it
makes.

I'm also curious about virtqueue_pop()/virtqueue_push() outside the QEMU mutex
although that might be blocked by the ongoing work on MMIO/PIO dispatch
outside the global mutex.

Stefan


