
Re: [Qemu-devel] [PATCH][RFC] Linux AIO support when using O_DIRECT

From: Anthony Liguori
Subject: Re: [Qemu-devel] [PATCH][RFC] Linux AIO support when using O_DIRECT
Date: Mon, 23 Mar 2009 13:10:30 -0500
User-agent: Thunderbird (X11/20090320)

Christoph Hellwig wrote:
> On Mon, Mar 23, 2009 at 12:14:58PM -0500, Anthony Liguori wrote:
>> I'd like to see the O_DIRECT bounce buffering removed in favor of the
>> DMA API bouncing. Once that happens, raw_read and raw_pread can
>> disappear. block-raw-posix becomes much simpler.
>
> See my vectored I/O patches for doing the bounce buffering at the
> optimal place for the aio path. Note that from my reading of the
> qcow/qcow2 code they might send down unaligned requests, which is
> something the DMA API would not help with.

I was going to look today at applying those.

> For the buffered I/O path we will always have to do some sort of buffering
> due to all the partition header reading / etc.  And given how that part
> isn't performance critical my preference would be to keep doing it in
> bdrv_pread/write and guarantee the lowlevel drivers proper alignment.

I really dislike having so many APIs. I'd rather have an aio API that takes byte accesses, or have pread/pwrite always be emulated with a full sector read/write.

We would drop the signaling stuff and have the thread pool use an fd to signal. The big problem with that right now is that it'll cause a performance regression for certain platforms until we have the IO thread in place.

> Talking about signaling, does anyone remember why the Linux signalfd/
> eventfd support is only in kvm but not in upstream qemu?

Because upstream QEMU doesn't yet have an IO thread.

TCG chains together TBs (translation blocks), and if you have a tight loop in a VCPU, the only way to break out of it is to receive a signal. The signal handler calls cpu_interrupt(), which unchains the TBs, allowing TCG execution to break once you return from the handler.

An IO thread solves this differently by letting select() always run in parallel with TCG VCPU execution. When select() returns, you can send a signal to the TCG VCPU thread to break it out of chained TBs.

Not all IO in qemu generates a signal, so this is a potential problem; in practice, if we don't generate a signal for disk IO completion, a number of real-world guests break (mostly non-x86 boards).


Anthony Liguori
