
Re: [Qemu-devel] [PATCH 1/5] Add target memory mapping API


From: Avi Kivity
Subject: Re: [Qemu-devel] [PATCH 1/5] Add target memory mapping API
Date: Mon, 19 Jan 2009 20:43:06 +0200
User-agent: Thunderbird 2.0.0.19 (X11/20090105)

Jamie Lokier wrote:
> Avi Kivity wrote:
>> In fact, we could even say that the virtual hardware doesn't support
>> dma-to-mmio at all and MCE the guest.  I'm sure no x86 guest would
>> even notice.  Don't know about non-x86.

> Guest userspace does:
>
>    1. mmap() framebuffer device.
>    2. read() from file opened with O_DIRECT.
>
> Both are allowed by non-root processes on Linux.
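For concreteness, that two-step sequence looks roughly like the sketch
below; the device path, file name, and transfer size are illustrative
only, and O_DIRECT additionally wants block-aligned lengths.

/* Illustrative sketch, not from the thread: a non-root process maps the
 * framebuffer and asks the kernel to read file data straight into that
 * mapping with O_DIRECT. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    size_t len = 16 * 4096;     /* multiple of the block size, as O_DIRECT requires */

    int fb = open("/dev/fb0", O_RDWR);                     /* 1. mmap() the framebuffer */
    void *fbmem = mmap(NULL, len, PROT_READ | PROT_WRITE,
                       MAP_SHARED, fb, 0);
    if (fb < 0 || fbmem == MAP_FAILED) {
        perror("framebuffer");
        return 1;
    }

    int fd = open("frame.raw", O_RDONLY | O_DIRECT);        /* 2. read() with O_DIRECT */
    if (fd < 0 || read(fd, fbmem, len) < 0) {
        perror("read");          /* the read targets video memory directly */
    }

    close(fd);
    munmap(fbmem, len);
    close(fb);
    return 0;
}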

> (I imagine this might be more common in some obscure DOS programs though).
>
> Think also of the variation where data is read from a video capture
> device into video memory.  I've seen that done on x86, never seen it
> (yet) on non-x86 :-)
>
> However, that is known to break on some PCI bridges.
>
> I'm not sure if it's reasonable to abort emulation with an MCE in this
> case.


Framebuffers are mapped as RAM, so we won't bounce this case. Try harder :)

>>> I think my question about partial DMA writes is very relevant.  If we
>>> don't care about that, nor about the corresponding notification for
>>> reads, then the API can be a lot simpler.
>>
>> I don't see a concrete reason to care about it.

> Writing zeros or junk after a partial DMA is quite different to real
> hardware behaviour.  Virtually all devices with a "DMA count" register
> are certain not to have written to a later address when DMA stops.


The devices we're talking about here don't have a DMA count register. They are passed scatter-gather lists, and I don't think they make guarantees about the order in which they're accessed.
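Conceptually such a list is just an array of address/length descriptors,
roughly like the sketch below; the struct and field names are made up,
not taken from the patch series.

#include <stdint.h>

/* Made-up names: the general shape of a scatter-gather list a guest
 * driver hands to such a device. */
struct sg_entry {
    uint64_t addr;   /* guest physical address of one chunk */
    uint32_t len;    /* length of that chunk in bytes */
};

/* The device may fetch and fill these entries in whatever order it
 * likes, so "bytes transferred so far" is not well defined while a
 * request is in flight. */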

> QEMU tries to do a fairly good job at emulating devices with many of
> their quirks.  It would be odd if the high-performance API got in the
> way of high-quality device emulation, when that's wanted.
>
> Potential example: if a graphics card, video capture card, or USB
> webcam etc. (more likely!) is doing a large streaming DMA into a
> guest's userspace process when that process calls read() (in the
> guest OS), and the DMA is stopped for any reason, such as a guest OS
> SIGINT or simply the data having ended, the guest's userspace can
> reasonably assume that data after the count returned by read() is
> untouched.

This DMA will be into RAM, not mmio.

> Just as importantly, the guest OS in that example can assume that the
> later pages are not dirtied, therefore not swap them, or return them
> to its pre-zero pool or whatever.  This is a legitimate guest OS
> optimisation for streaming-DMA-with-unknown-length devices.  This can
> happen without a userspace process too.
>
> I'm guessing truncated DMAs using this API are always going to dirty
> only an initial part of the buffer, not arbitrary regions.  (In rare
> cases where this isn't true, don't use the API).

> So wouldn't it be trivial to pass "amount written" to the unmap
> function - to be used in the bounce buffer case?

We don't have a reliable amount to pass.
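For what it's worth, the suggestion would amount to something like the
sketch below on the bounce-buffer path; every name and type here is
illustrative, not taken from the patch series.

/* Sketch only: how an "amount written" argument could flow into the
 * unmap side, so the bounce-buffer path writes back just the accessed
 * prefix and leaves the tail of the guest buffer untouched. */
#include <stdlib.h>
#include <string.h>

struct bounce_buffer {
    void  *host;        /* temporary host-side buffer */
    void  *guest_ram;   /* where the data ultimately belongs */
    size_t len;         /* size of the mapping */
};

static void bounce_unmap(struct bounce_buffer *b, int is_write,
                         size_t access_len)
{
    if (is_write && access_len > 0 && access_len <= b->len) {
        /* Copy back only what the device reports having written,
         * mimicking a real device that stopped mid-transfer. */
        memcpy(b->guest_ram, b->host, access_len);
    }
    free(b->host);
}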

--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.




