qemu-devel

Re: [Qemu-devel] coroutines and block I/O considerations


From: Kevin Wolf
Subject: Re: [Qemu-devel] coroutines and block I/O considerations
Date: Tue, 19 Jul 2011 12:10:43 +0200
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.17) Gecko/20110428 Fedora/3.1.10-1.fc15 Thunderbird/3.1.10

On 19.07.2011 10:06, Frediano Ziglio wrote:
>   I've been familiarizing myself with the block I/O layer and decided to
> test the coroutine branch because I find it easier to use than normal
> callbacks. The normal code needs many lines of source to save/restore
> state and declare callbacks, and the normal flow is not easy to
> follow.

Yes. This is one of the reasons why we're trying to switch to
coroutines. QED is a prototype for a fully asynchronous callback-based
image format, and sometimes it's really hard to follow its code paths.
That the real functionality gets lost in the noise of transferring state
doesn't really help with readability either.

> In the end I would like to create a new
> image format to get rid of some performance problems I encounter using
> writethrough and snapshots. I have some questions regarding block I/O
> and also coroutines.

No. A new image format is the wrong answer, whatever the question may
be. :-)

If writethrough doesn't perform well with the existing format drivers,
fix the existing format drivers. You need very good reasons to convince
me that qcow2 can't do what your new format could do.

The solution for slow writethrough mode in qcow2 is probably to make
requests parallel, even if they touch metadata. This is a change that
becomes possible relatively easily once we have switched to coroutines.

What exactly is the problem with snapshots? Is saving/loading internal
snapshots too slow, or is it general performance with an image that has
snapshots? I think Luiz reported the first one a while ago, and it
should be easy enough to fix (use Qcow2Cache in writeback mode during
the refcount update).

> 1- Threading model. I don't understand it. I can see that the aio pool
> routines do not contain locking code, so I think the aio layer is mainly
> executed in a single thread. I saw the introduction of some locking using
> coroutines, so I think coroutines are now called from different threads
> and need locks (the current implementation serializes all device
> operations).

You can view coroutines as threads with cooperative scheduling. That is,
unlike a thread, a coroutine is never interrupted by a scheduler; it
gives up control only by calling qemu_coroutine_yield(), which transfers
control to a different coroutine. Compared to threads this simplifies
locking a bit because you know exactly at which points other code may run.

But of course, even though you know where it happens, you still have other
code running in the middle of your function, so there can be a need to
lock things, which is why there are things like a CoMutex.

They are still all running in the same thread.
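
To illustrate why a lock can still be needed in a single thread, here is
a minimal sketch using Python generators as stand-ins for coroutines. It
is not QEMU code: run_round_robin(), the "locked" flag and both increment
functions are hypothetical names made up for this example, and the flag
only imitates what a CoMutex is for. A bare "yield" plays the role of
qemu_coroutine_yield().

```python
from collections import deque


def run_round_robin(coroutines):
    """Resume each generator in turn until all of them have finished."""
    ready = deque(coroutines)
    while ready:
        co = ready.popleft()
        try:
            next(co)          # run until the coroutine's next yield
        except StopIteration:
            continue          # finished, do not reschedule
        ready.append(co)


def unlocked_increment(state):
    value = state["counter"]      # read
    yield                         # other coroutines may run here
    state["counter"] = value + 1  # write back a possibly stale value


def locked_increment(state):
    # Hypothetical stand-in for a CoMutex: wait cooperatively until free.
    while state["locked"]:
        yield
    state["locked"] = True
    value = state["counter"]
    yield                         # others run, but cannot enter the section
    state["counter"] = value + 1
    state["locked"] = False


unlocked = {"counter": 0}
run_round_robin([unlocked_increment(unlocked) for _ in range(2)])

locked = {"counter": 0, "locked": False}
run_round_robin([locked_increment(locked) for _ in range(2)])

print(unlocked["counter"], locked["counter"])
```

Without the flag, both coroutines read the counter before either writes
it back, so one update is lost. With it, the second coroutine waits at a
known yield point until the first has finished its update.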

> 2- Memory considerations for coroutines. Besides coroutines allowing more
> readable code, I wonder whether anybody has considered memory. Every
> coroutine needs its own stack. For instance, the
> ucontext and win32 implementations use 4 MB. Assuming 128 concurrent AIO
> requests, this requires about 512 MB of RAM (mostly only committed but
> not used, and coroutines are reused).

128 concurrent requests is a lot. And even then, it's only virtual
memory. I doubt that we're actually using much more than we do in the
old code with the AIOCBs (which will disappear and become local
variables when we complete the conversion).
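
As a back-of-the-envelope check of the figures quoted above, the
following sketch separates reserved address space from memory that is
actually touched. The 4 MB stack size and the 128 requests come from the
question; the 16 KiB of stack touched per coroutine is purely an assumed
number for illustration, not a measurement.

```python
STACK_SIZE_MIB = 4            # per-coroutine stack size quoted above
CONCURRENT_REQUESTS = 128     # in-flight AIO requests assumed in the question

# Address space reserved for stacks (virtual memory).
reserved_mib = STACK_SIZE_MIB * CONCURRENT_REQUESTS

# ASSUMPTION: each coroutine only ever touches 16 KiB of its stack.
# This figure is hypothetical and only shows the order of magnitude.
TOUCHED_PER_STACK_KIB = 16
touched_mib = TOUCHED_PER_STACK_KIB * CONCURRENT_REQUESTS / 1024

print(f"reserved: {reserved_mib} MiB, touched: {touched_mib} MiB")
```

Under that assumption the stacks reserve 512 MiB of address space but
only about 2 MiB would be backed by real pages.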

> About snapshots and block I/O, I think that using an "external snapshot"
> would help make some things easier. By "external snapshot" I mean
> creating a new image with the current image file as its backing file and
> using this new image for future operations. This would allow, for instance:
> - supporting snapshots with every format (even raw)
> - making snapshot backups using external programs (even from different
> hosts using a clustered file system, without many locking issues, as the
> original image is now read-only)
> - converting images live (just snapshot, qemu-img convert, remove snapshot)

These are things that are being actively worked on. snapshot_blkdev is a
monitor command that already exists and does exactly what you describe.
For the rest, live block copy and image streaming are the keywords that
you should be looking for. We've had quite some discussions on these in
the past few weeks. You may also be interested in this wiki page:
http://wiki.qemu.org/Features/LiveBlockMigration
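
The backing-file idea behind an external snapshot can be sketched with a
toy model. This is not how snapshot_blkdev or any QEMU format driver is
implemented: the Image class and external_snapshot() are hypothetical
names, and clusters are just dictionary entries. It only shows that new
writes land in the overlay while the original image stays unchanged and
can be treated as read-only.

```python
class Image:
    """Toy image: sparse map of cluster index -> data, plus optional backing image."""

    def __init__(self, backing=None):
        self.clusters = {}
        self.backing = backing
        self.read_only = False

    def write(self, index, data):
        if self.read_only:
            raise PermissionError("image is a read-only backing file")
        self.clusters[index] = data

    def read(self, index):
        # Walk the backing chain until some image has the cluster allocated.
        image = self
        while image is not None:
            if index in image.clusters:
                return image.clusters[index]
            image = image.backing
        return b"\0"  # unallocated everywhere reads as zeros in this toy model


def external_snapshot(active):
    """Freeze the active image and return a new, empty overlay on top of it."""
    active.read_only = True
    return Image(backing=active)


base = Image()
base.write(0, b"old")
base.write(1, b"kept")

overlay = external_snapshot(base)
overlay.write(0, b"new")          # goes to the overlay only

print(overlay.read(0), overlay.read(1), base.read(0))
```

Because the base image no longer changes after the snapshot, an external
program could copy it while the guest keeps writing to the overlay.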

Kevin


