From: Stefan Hajnoczi
Subject: Re: [Qemu-block] [Qemu-devel] Block layer complexity: what to do to keep it under control?
Date: Fri, 1 Dec 2017 14:08:33 +0000
User-agent: Mutt/1.9.1 (2017-09-22)

On Fri, Dec 01, 2017 at 06:16:44PM +0800, Fam Zheng wrote:
> On Thu, 11/30 14:19, Stefan Hajnoczi wrote:
> > On Thu, Nov 30, 2017 at 05:47:09PM +0800, Fam Zheng wrote:
> > > On Wed, 11/29 12:00, Stefan Hajnoczi wrote:
> > > > On Wed, Nov 29, 2017 at 11:55:02AM +0800, Fam Zheng wrote:
> > > > 
> > > > Event loops and coroutines are good but they should not be used directly
> > > > by block drivers and block jobs.  We need safe, high-level APIs that
> > > > implement commonly-used operations.
> > > > 
> > > > > - Documentation
> > > > > 
> > > > > >   There is no central developer doc about block layer, especially
> > > > > >   how all pieces fit together. Having one will make it a lot easier
> > > > > >   for new contributors to understand better. Of course, we're facing
> > > > > >   the old problem: the code is moving, maintaining an updated
> > > > > >   document needs effort.
> > > > > 
> > > > >   Idea: add ./doc/deve/block.txt?
> > > > 
> > > > IOThreads and AioContexts are addressed here:
> > > > docs/devel/multiple-iothreads.txt
> > > > 
> > > > The game has become significantly more complex than what the document
> > > > describes.  It's lacking aio_co_wake() and aio_co_schedule() for
> > > > example.
> > > > 
> > > > > - Simplified code, or more orthogonal/modularized architecture.
> > > > > 
> > > > > >   Each aspect of block layer is complex enough so isolating them as
> > > > > >   much as possible is a reasonable approach to control the
> > > > > >   complexity. Block jobs and throttling becoming block filters is a
> > > > > >   good example, we should identify more.
> > > > > > 
> > > > > >   Idea: rethink event loops. Create coroutines ubiquitously (for
> > > > > >   example for each fd handler, BH and timer), so that many nested
> > > > > >   aio_poll() can be removed.
> > > > > > 
> > > > > >   Crazy idea: move the whole block layer to a vhost process, and
> > > > > >   implement existing features differently, especially in terms of
> > > > > >   multi-threading (hint: rust?).
> > > > 
> > > > A reimplementation will not solve the problem because:
> > > > 
> > > > 1. If it still has the same feature set and requirements then the level
> > > >    of complexity will be comparable.
> > > > 
> > > > 2. We can reduce accidental (inessential) complexity by continuing the
> > > >    various efforts around the block graph, block jobs, multi-queue block
> > > >    layer with an eye towards higher level APIs.
> > > 
> > > > Starting over is certainly not the motivation to do qemu-vhost, but it
> > > > would be an opportunity to use different async/concurrency paradigms if
> > > > that is going to happen. I think in current block layer, event loop +
> > > > coroutine is a good combination, but having nested aio_poll()'s made it
> > > > worse, then mixing IOThreads in makes it a lot more complicated.
> > 
> > What alternative model are you thinking of?
> 
> To utilize whatever is offered in the different language. In particular I've
> heard good things about rust (without programming it myself) that doing
> concurrency correctly is easier with it. We'll probably lose all the good bits
> about coroutine (unlike what is special in Go), but I expect using simpler
> concurrency models (IOW threads only) can lead to simpler code. (I have no
> problem with coroutine except the debuggability problem I pointed out, which
> hopefully can be solved by writing more gdb extensions.)

[A long rant here but I hope it contains useful points.]

Rust's threading model is 1:1.  Besides mutexes it also has channels
(they look similar to Go's channels, i.e. communicating sequential
processes-style channels).

It is probably not feasible to make each I/O request a thread (i.e. 1M
IOPS means creating/destroying 1M threads/sec).  There would have to be
some machinery like a request queue and a thread pool to process
requests.  That way requests can be held in the queue, and only those
that are ready to be executed run on threads.  Hmm...this sounds similar
to what we have.
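
To make the queue-plus-pool idea concrete, here is a minimal sketch.  It
is in Python only for brevity, not Rust or QEMU code, and every name in
it (run_requests, pending, the "payload * 2" stand-in for I/O) is made
up for illustration.  Requests wait in a queue and a fixed number of
worker threads pick them up, so no thread is created per request:

```python
import queue
import threading


def run_requests(requests, num_workers=4):
    """Process (request_id, payload) pairs on a fixed pool of threads.

    The "I/O" is a placeholder that doubles the payload.
    """
    pending = queue.Queue()          # request queue: holds work until a worker is free
    results = {}
    results_lock = threading.Lock()  # workers share `results`, so lock explicitly

    def worker():
        while True:
            item = pending.get()
            if item is None:         # sentinel: no more requests for this worker
                return
            request_id, payload = item
            completed = payload * 2  # placeholder for the real I/O
            with results_lock:
                results[request_id] = completed

    workers = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in workers:
        t.start()
    for request in requests:
        pending.put(request)
    for _ in workers:
        pending.put(None)            # one sentinel per worker
    for t in workers:
        t.join()
    return results
```

Note the explicit lock around the shared dictionary: with plain threads
every piece of shared state needs this kind of care.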

The important thing is this:

Coroutines with a single event loop (current model in QEMU) are simpler
than threads.  Why?  Because coroutine code is atomic with respect to
other coroutines in the same event loop.  Only yield points or nested
event loops allow other coroutines to execute.  That means less explicit
synchronization is necessary.
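
A toy illustration of that atomicity, using Python generators as
stand-in coroutines and a made-up round-robin scheduler
(run_event_loop, increment, racy_increment and demo are all
hypothetical names, not QEMU APIs).  A read-modify-write that contains
no yield point is safe without a lock.  Put a yield in the middle and
updates get lost:

```python
from collections import deque


def run_event_loop(coroutines):
    """Round-robin scheduler: resume each generator until all have finished."""
    ready = deque(coroutines)
    while ready:
        co = ready.popleft()
        try:
            next(co)              # runs until the coroutine's next yield
            ready.append(co)
        except StopIteration:
            pass


def increment(state, times):
    for _ in range(times):
        # Read-modify-write with no lock: nothing else can run between
        # these two statements because there is no yield point here.
        value = state["counter"]
        state["counter"] = value + 1
        yield                     # yield point: other coroutines may run now


def racy_increment(state, times):
    for _ in range(times):
        value = state["counter"]
        yield                     # yield in the middle of the update
        state["counter"] = value + 1


def demo(coroutine_fn, num_coroutines=3, times=100):
    state = {"counter": 0}
    run_event_loop([coroutine_fn(state, times) for _ in range(num_coroutines)])
    return state["counter"]
```

With these defaults demo(increment) returns 300, while
demo(racy_increment) returns less because the coroutines overwrite each
other's updates.  The only places you have to think about interleaving
are the yield points.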

When the block layer goes multiqueue this advantage will be lost and
coroutine code will have to synchronize explicitly just like threaded
code.  Coroutines will remain lighter weight than threads and will allow
M:N threading to be configured via IOThreads.
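
As a rough sketch of what that looks like (again Python for brevity,
with hypothetical names such as run_multiqueue and locked_increment):
once several event loops run in separate threads, a coroutine in one
loop can race with a coroutine in another loop, so the shared counter
needs a lock even though each loop is still cooperative internally:

```python
import threading
from collections import deque


def run_event_loop(coroutines):
    """Round-robin scheduler for one thread's set of coroutines."""
    ready = deque(coroutines)
    while ready:
        co = ready.popleft()
        try:
            next(co)
            ready.append(co)
        except StopIteration:
            pass


def locked_increment(state, lock, times):
    for _ in range(times):
        with lock:                # needed: another loop's thread may run concurrently
            state["counter"] += 1
        yield                     # the lock is never held across a yield


def run_multiqueue(num_loops=2, coroutines_per_loop=3, times=1000):
    state = {"counter": 0}
    lock = threading.Lock()
    threads = []
    for _ in range(num_loops):
        coroutines = [locked_increment(state, lock, times)
                      for _ in range(coroutines_per_loop)]
        threads.append(threading.Thread(target=run_event_loop,
                                        args=(coroutines,)))
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["counter"]
```

Here M coroutines are spread over N threads, and the number of loops
plays the role that the number of IOThreads would.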

I'm not hopeful that dropping coroutines helps and I don't see that Rust
brings anything new to the table here.  I do like other aspects of Rust
and am open to using it for new code.

What I'm getting at is that the essential complexity of a parallel I/O
engine with a runtime reconfigurable graph, background operations, and
non-trivial disk image file formats sets a certain floor (minimum) on
complexity.  The consequence is that our code needs to handle
concurrency and this is where we fall down today.

Switching programming language will not reduce complexity below this
floor.  We'd have to abandon features in order to reduce complexity.
"Let's start from scratch" attempts often hope to do this but ultimately
users don't want to lose features.

We need to think through the corner cases we've been hitting and in the
process of fixing them we should consider if simplified interfaces/APIs
with less concurrency would allow us to write better code at the expense
of less control and performance in cases where we don't need it.

That means insulate block jobs and drivers as much as possible.  Don't
make them use low-level primitives.  Make them use high-level APIs where
the concurrency is baked in and as safe as we can make it without giving
up performance.

> Another thing about rust is it can call into C code so maybe the change can be
> done incrementally like suggested by Dan in his libvirt discussion about using
> Go:
> 
> https://www.redhat.com/archives/libvir-list/2017-November/msg00528.html

Max's qcow2 driver in Rust didn't make me very optimistic about mixed C
and Rust.  Putting a Rust layer on top of C looks sane.  I don't think
you can mix C and Rust side-by-side without a lot of duplication and
boilerplate, but my knowledge is limited to looking at Max's qcow2
driver.
