
Re: [Qemu-devel] [RFC] Replace posix-aio with custom thread pool


From: Andrea Arcangeli
Subject: Re: [Qemu-devel] [RFC] Replace posix-aio with custom thread pool
Date: Fri, 12 Dec 2008 15:13:33 +0100

On Fri, Dec 12, 2008 at 12:54:21PM +0100, Jens Axboe wrote:
> I agree completely. The buffered aio patches got pretty involved though,
> it wasn't real pretty in the end. So it never got merged. Looks like the
> most realistic way forward is some variant of syslet (or the acall stuff
> that Zach has been working on), which is largely a cop out and will
> never perform as well.

It'll at least perform better than a brand new userland pool of threads
for each task that needs aio functionality, and it can be optimized
later if we want ;).

But I'm surprised: the aio patches in 2.4 were very clean, we didn't
have to break filesystems, and it was really nicely done work,
enterprise quality, as demonstrated by the several databases that ran
on it for years. Ironically the O_DIRECT part didn't work at the
time... because the O_DIRECT part is effectively more difficult. So
2.6 has the hard stuff done and misses the simpler stuff. I guess the
simpler stuff is harder to merge as it has more users.

Well I hope it'll be fixed... for kvm/qemu we definitely require aio
for buffered reads too (buffered writes aren't a big deal but reads
are). For the parent images it makes sense to run them in buffered
mode even on servers using O_DIRECT, so basically we can't use
linux-aio until this is fixed somehow.

In the meantime I think it'd be better to return -EINVAL (so userland
can fall back to a userland thread pool) instead of just behaving
synchronously, which can break GUI and interactive behavior...
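
A minimal sketch of that fallback on the userland side, assuming the
kernel rejected unsupported (buffered) requests with -EINVAL rather than
completing them synchronously. The struct, the enum, and all function
names here are hypothetical, not QEMU or libaio API; the two "fake"
submitters only stand in for the real kernel and thread-pool paths.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical request descriptor; not a QEMU or libaio type. */
struct io_req {
    int fd;
    void *buf;
    size_t len;
    long long offset;
    int buffered;           /* nonzero if the fd was opened without O_DIRECT */
};

/* A submitter returns 0 on success or a negative errno value. */
typedef int (*submit_fn)(struct io_req *req);

enum io_path { IO_PATH_KERNEL_AIO, IO_PATH_THREAD_POOL, IO_PATH_FAILED };

/*
 * Try kernel aio first. Only -EINVAL ("not supported for this request")
 * triggers the fallback to the userland thread pool; any other error is
 * reported to the caller unchanged as a failure.
 */
static enum io_path submit_with_fallback(struct io_req *req,
                                         submit_fn kernel_submit,
                                         submit_fn pool_submit)
{
    int ret = kernel_submit(req);

    if (ret == 0)
        return IO_PATH_KERNEL_AIO;
    if (ret == -EINVAL && pool_submit(req) == 0)
        return IO_PATH_THREAD_POOL;
    return IO_PATH_FAILED;
}

/* Stand-in for a kernel that refuses buffered requests (assumption). */
static int fake_kernel_submit(struct io_req *req)
{
    return req->buffered ? -EINVAL : 0;
}

/* Stand-in for a thread pool that accepts every request. */
static int fake_pool_submit(struct io_req *req)
{
    (void)req;
    return 0;
}
```

With these stand-ins an O_DIRECT request stays on the kernel path and a
buffered request lands on the thread pool, so the caller never blocks
inside the submit call.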

> I added CLONE_IO some time ago to avoid that, so it's perfectly possible
> to share cfq io contexts with threads or processes even in userspace!

It's available in recent kernels, I see! So the fix is easy. The only
problem is how to pass CLONE_IO to pthread_create... We'll have to
make a linux-only change and call clone by hand under some #ifdef
CLONE_IO.
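
A rough sketch of what calling clone by hand could look like, using the
glibc clone() wrapper. The helper names and the stack size are
assumptions made for illustration. The sketch deliberately omits
CLONE_THREAD, CLONE_SIGHAND and TLS setup, which pthread_create normally
handles, so the worker here must not rely on thread-local state; it
shows the flag handling only, not a full thread pool.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE         /* for clone() and the CLONE_* flags */
#endif
#include <errno.h>
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

#define WORKER_STACK_SIZE (64 * 1024)   /* arbitrary size for the sketch */

/*
 * Run fn(arg) in a child that shares memory, filesystem info and file
 * descriptors with the caller, plus the I/O context when CLONE_IO is
 * available at build time. Waits for the child and returns its exit
 * status, or a negative errno value on failure.
 */
static int run_io_worker(int (*fn)(void *), void *arg)
{
    int flags = CLONE_VM | CLONE_FS | CLONE_FILES | SIGCHLD;
    char *stack;
    pid_t pid;
    int status;
    int err;

#ifdef CLONE_IO
    flags |= CLONE_IO;      /* share the parent's io context */
#endif

    stack = malloc(WORKER_STACK_SIZE);
    if (stack == NULL)
        return -ENOMEM;

    /* The stack grows down on Linux targets, so pass its top. */
    pid = clone(fn, stack + WORKER_STACK_SIZE, flags, arg);
    if (pid == -1) {
        err = errno;
        free(stack);
        return -err;
    }

    if (waitpid(pid, &status, 0) == -1) {
        err = errno;
        free(stack);
        return -err;
    }

    free(stack);
    return WIFEXITED(status) ? WEXITSTATUS(status) : -ECHILD;
}

/* Trivial worker: sets a flag in memory shared through CLONE_VM. */
static int mark_done(void *arg)
{
    *(volatile int *)arg = 1;
    return 0;
}
```

Without CLONE_IO defined the same code still builds and runs; the worker
simply gets its own io context, which is the behavior we have today.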



