From: Anthony Liguori
Subject: Re: [Qemu-devel] [PATCH 08/13] iommu: Introduce IOMMU emulation infrastructure
Date: Tue, 15 May 2012 18:58:42 -0500
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:11.0) Gecko/20120329 Thunderbird/11.0.1

On 05/15/2012 06:08 PM, Benjamin Herrenschmidt wrote:
On Tue, 2012-05-15 at 17:02 -0500, Anthony Liguori wrote:

"6.2.1 Register Based Invalidation Interface
The register based invalidations provides a synchronous hardware interface for
invalidations.  Software is expected to write to the IOTLB registers to submit
invalidation command and may poll on these registers to check for invalidation
completion. For optimal performance, hardware implementations are recommended to
complete an invalidation request with minimal latency"

This makes perfect sense.  You write to an MMIO location to request invalidation
and then *poll* on a separate register for completion.

It's not a single MMIO operation with an indefinite return duration.

Sure, it's an implementation detail, I never meant that it had to be a
single blocking register access, all I said is that the HW must provide
such a mechanism that is typically used synchronously by the operating
system. Polling for completion is a perfectly legit way to do it, that's
how we do it on the Apple G5 "DART" iommu as well.

The fact that MMIO operations can block is orthogonal; it is possible,
however, especially with ancient PIO devices.

Even ancient PIO devices really don't block indefinitely.

In our case (TCEs) it's a hypervisor call, not an MMIO op, so to some
extent it's even more likely to do "blocking" things.

Yes, so I think the right thing to do is not model hypercalls for sPAPR as synchronous calls but rather as asynchronous calls. Obviously, simple ones can use a synchronous implementation...

This is a matter of setting hlt=1 before dispatching the hypercall and passing a continuation to the call that, when executed, prepares the CPUState for the hypercall return and then sets hlt=0 to resume the CPU.
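
A rough sketch of that dispatch pattern follows, as an illustration only. Every name here (VCPUSketch, HcallCont, hcall_dispatch and the two handlers) is made up for the example and is not existing QEMU code; the real CPUState and hypercall tables look different.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the relevant bits of CPUState. */
typedef struct VCPUSketch {
    bool hlt;        /* true while the VCPU is parked waiting for the hcall */
    long hcall_ret;  /* value the guest sees when the hcall returns */
} VCPUSketch;

/* Hypothetical continuation: what to do once the hcall has finished. */
typedef struct HcallCont {
    VCPUSketch *cpu;
    long ret;
} HcallCont;

/* Running the continuation prepares the return value and resumes the CPU. */
static void hcall_cont_run(HcallCont *k)
{
    k->cpu->hcall_ret = k->ret;
    k->cpu->hlt = false;
}

/* A handler either runs the continuation right away or stashes it. */
typedef void (*hcall_handler)(HcallCont *k);

static void hcall_dispatch(VCPUSketch *cpu, HcallCont *k, hcall_handler fn)
{
    cpu->hlt = true;   /* park the VCPU before dispatching */
    k->cpu = cpu;
    fn(k);
}

/* Simple hypercalls can stay synchronous: complete immediately. */
static void sync_handler(HcallCont *k)
{
    k->ret = 0;
    hcall_cont_run(k);
}

/* A slow hypercall squirrels the continuation away for later. */
static HcallCont *stashed;

static void deferred_handler(HcallCont *k)
{
    k->ret = 0;
    stashed = k;
}
```

With sync_handler the VCPU is already runnable again when hcall_dispatch returns; with deferred_handler it stays halted until something later calls hcall_cont_run(stashed).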

It would have been possible to implement a "busy" return status with the
guest having to try again, unfortunately that's not how Linux has
implemented it, so we are stuck with the current semantics.

Now, if you think that dropping the lock isn't good, what do you reckon
I should do?

Add a reference count to dma map calls and a flush_pending flag. If flush_pending && ref > 0, return NULL for all map calls.

Decrement ref on unmap and if ref = 0 and flush_pending, clear flush_pending. You could add a flush_notifier too for this event.

dma_flush() sets flush_pending if ref > 0. Your TCE flush hypercall would register for flush notifications and squirrel away the hypercall completion continuation.
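
The three rules above could be sketched roughly as below. This is only an illustration under assumed names: DMAContextSketch, the *_sketch functions and mark_done are hypothetical and do not correspond to the actual DMA API in the patch series. The notifier stands in for whatever would run the stored hypercall completion continuation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flush notifier callback. */
typedef void (*flush_notify_fn)(void *opaque);

typedef struct DMAContextSketch {
    unsigned ref;            /* outstanding dma map calls */
    bool flush_pending;      /* a flush is waiting for ref to reach 0 */
    flush_notify_fn notify;  /* called when the pending flush completes */
    void *notify_opaque;
} DMAContextSketch;

/* Map: refuse new mappings while a flush is pending and ref > 0. */
static void *dma_map_sketch(DMAContextSketch *dma, void *host_ptr)
{
    if (dma->flush_pending && dma->ref > 0) {
        return NULL;
    }
    dma->ref++;
    return host_ptr;
}

/* Unmap: drop the reference; the last unmap completes a pending flush. */
static void dma_unmap_sketch(DMAContextSketch *dma)
{
    assert(dma->ref > 0);
    dma->ref--;
    if (dma->ref == 0 && dma->flush_pending) {
        dma->flush_pending = false;
        if (dma->notify) {
            dma->notify(dma->notify_opaque);
        }
    }
}

/* Flush: returns true if done immediately, false if deferred. */
static bool dma_flush_sketch(DMAContextSketch *dma,
                             flush_notify_fn fn, void *opaque)
{
    if (dma->ref > 0) {
        dma->flush_pending = true;
        dma->notify = fn;
        dma->notify_opaque = opaque;
        return false;
    }
    return true;
}

/* Example notifier: a real one would run the hypercall continuation. */
static void mark_done(void *opaque)
{
    *(bool *)opaque = true;
}
```

A TCE flush hypercall would call dma_flush_sketch(); if it returns false, the hypercall leaves the VCPU halted and lets the notifier finish the call once the last mapping goes away.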

VT-d actually has a concept of an invalidation completion queue which delivers interrupt-based notification of invalidation completion events. The above flush_notify would be the natural way to support this since in this case, there is no VCPU event that's directly involved in the completion event.


Anthony Liguori

