[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v7 3/3] sPAPR: EEH support for VFIO PCI device

From: Alexander Graf
Subject: Re: [Qemu-devel] [PATCH v7 3/3] sPAPR: EEH support for VFIO PCI device
Date: Wed, 28 May 2014 15:16:09 +0200
User-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:24.0) Gecko/20100101 Thunderbird/24.5.0

On 28.05.14 14:33, Gavin Shan wrote:
On Wed, May 28, 2014 at 01:24:15PM +0200, Alexander Graf wrote:
On 28.05.14 06:12, Gavin Shan wrote:
On Wed, May 28, 2014 at 12:41:37AM +0200, Alexander Graf wrote:
On 28.05.14 00:27, Benjamin Herrenschmidt wrote:
On Wed, 2014-05-28 at 00:22 +0200, Alexander Graf wrote:

In any case, the above isn't the problem, we register rtas functions
called rtas_ibm_*, that's fine.
They're called rtas_ibm, not rtas_vfio. They simply live in the wrong
file for starters. Once you start moving them out of *vfio*.c and
make sure you don't sneak in vfio headers into spapr_rtas.c all the
abstraction should come naturally.

The problem you have is with what's inside these functions, correct ?
You want some kind of PCI Dev ops... am I understanding properly ?
Instead of having them call the VFIO ioctl's directly.
We have a nice bus hierarchy. The RTAS call tells you the PHB's ID.
The machine knows about all of its PHBs, so you can find that one.

>From the PHB we can then find a PCIDevice by identifier if we
received one in the RTAS call.

If we received a PE, we need to act on a virtual PE object (not
available yet I think) which then would call at least into each VFIO
container associated with that guest PE. We don't want to call into
devices here, since we could have 2 VFIO devices in one PE and don't
want to do a PE level operation twice.

If we want to simplify things (and I'm fine with that), we can also
restrict a "guest PE" to only span at most a single VFIO container
context and unlimited emulated devices. That way we don't need to
worry about potential side effects.

I can't tell you what the "guest PE" would best be modeled at. Maybe
a simple list/hash inside the PHB is the answer. But we need to
ensure that the guest knows about its scope of action. And we need to
make sure that we maintain abstraction levels to keep at least a
minimum level of sanity for the reader.

sPAPRVFIOPHBState has one-to-one mapping with container, which has
one-to-one mapping relationship with IOMMU group on sPAPR platform.
So sPAPRVFIOPHBState is representing "guest PE".
So each PHB only spans a single PE?


Yes. RTAS calls from guest, 2 types of address information as input there:
PHB BUID+device's PCI config address, or PHB BUID+PE address. After removing
the logic of translating device's PCI config address to PE address from
host to QEMU, we don't have to care much about the first case (PHB BUID+
device's PCI config address). That means we only have to care "PE address",
which is basically the IOMMU group ID (plus fixed offset) of sPAPRPHBVFIOState.
So the BUID can be used to locate sPAPRPHBVFIOState and "PE address" can
help doing same thing.
... then we can just ignore the "PE address", no?

Yes, we can ignore that. But I plan to use it for more sanity check, which
isn't harmful.

As you mentioned above, spapr_rtas.c can't include "vfio.h". I guess I have
to rework on PCIDevice/PCIDeviceClass for a bit as follows. Alex, could you
help confirming it's what you're expecting to see?
I depends. If you need a PCI device level callback, then yes. If you
don't need a PCI level callback, just have one in the PHB. I'm not
quite sure how you can find the "VFIO container" that belongs to that
PHB though, but I'm sure you'll figure that out :).

Yeah, I'll have PHB level callback after having a discussion with
Alexey who gave good idea to make the abstraction as follows. Please
let me know if I'm in wrong direction again because I don't have QEMU
experience at all:

        RTAS call -> hw/ppc/spapr_pci.c where RTAS calls get registered
                   -> sPAPRPHBClass::rtas_eeh_handler() in 
                   -> vfio_container_ioctl() in hw/misc/vfio.c

I think that would make sense, yes :).


reply via email to

[Prev in Thread] Current Thread [Next in Thread]