From: David Gibson
Subject: Re: [Qemu-ppc] [PATCH v2 02/19] spapr: introduce a skeleton for the XIVE interrupt controller
Date: Mon, 16 Apr 2018 14:26:05 +1000
User-agent: Mutt/1.9.2 (2017-12-15)

On Thu, Apr 12, 2018 at 10:18:11AM +0200, Cédric Le Goater wrote:
> On 04/12/2018 07:07 AM, David Gibson wrote:
> > On Wed, Dec 20, 2017 at 08:38:41AM +0100, Cédric Le Goater wrote:
> >> On 12/20/2017 06:09 AM, David Gibson wrote:
> >>> On Sat, Dec 09, 2017 at 09:43:21AM +0100, Cédric Le Goater wrote:
> >>>> With the POWER9 processor comes a new interrupt controller called
> >>>> XIVE. It is composed of three sub-engines :
> >>>>
> >>>>   - Interrupt Virtualization Source Engine (IVSE). These are in PHBs,
> >>>>     in the main controller for the IPIs and in the PSI host
> >>>>     bridge. They are configured to feed the IVRE with events.
> >>>>
> >>>>   - Interrupt Virtualization Routing Engine (IVRE). Their job is to
> >>>>     match an event source with a Notification Virtualization Target
> >>>>     (NVT), a priority and an Event Queue (EQ) to determine if a
> >>>>     Virtual Processor can handle the event.
> >>>>
> >>>>   - Interrupt Virtualization Presentation Engine (IVPE). It maintains
> >>>>     the interrupt state of each hardware thread and presents the
> >>>>     notification as an external exception.
> >>>>
> >>>> Each of the engines uses a set of internal tables to redirect
> >>>> exceptions from event sources to CPU threads. The first table we
> >>>> introduce is the Interrupt Virtualization Entry (IVE) table, part of
> >>>> the virtualization engine in charge of routing events. It associates
> >>>> event sources (IRQ numbers) to event queues which will forward, or
> >>>> not, the event notification to the presentation controller.
> >>>>
> >>>> The XIVE model is designed to make use of the full range of the IRQ
> >>>> number space and does not use an offset like the XICS mode does.
> >>>> Hence, the IVE table is directly indexed by the IRQ number.
> >>>>
> >>>> Signed-off-by: Cédric Le Goater <address@hidden>
> >>>
> >>> As you've suggested yourself, I think we might need to more
> >>> explicitly model the different components of the XIVE system.  As part
> >>> of that, I think you need to be clearer in this base skeleton about
> >>> exactly what component your XIVE object represents.
> > 
> > Sorry it's been so long since I looked at these.
> 
> That's fine. I have been working on a XIVE device model for the PowerNV
> machine and KVM support for the pseries. I have a better understanding
> of the overall picture.
> 
> The patchset has not changed much so we can still discuss on this
> basis without me flooding the mailing list.
> 
> >> ok. The base skeleton is the IVRE, the central engine handling 
> >> the routing. 
> >>
> >>> If the answer is "the overall thing" 
> >>
> >> Yes, it is more or less that currently. 
> >>
> >> The sPAPRXive object models the source engine and the routing 
> >> engine in one object.
> > 
> > Yeah, I suspect we don't want that.  Although it might seem simpler in
> > the spapr case, at least at first glance, I think it will cause us
> > problems later.  At the very least, it's likely to make it harder to
> > share code between the spapr and powernv case.  I think it will also
> > make for more confusion about exactly what things belong where.
> 
> I tend to agree. 
> 
> We need to clarify (a bit) what is in the XIVE interrupt controller 
> silicon, and how XIVE works. The XIVE device models for spapr and 
> powernv should be very close as the differences are small. KVM support 
> should be built on the spapr model.
> 
> There are 3 different sub-engines in the XIVE interrupt controller
> device :
> 
> * IVSE (XiveSource model)
> 
>   interrupt sources, which expose their PQ bits through ESB MMIO pages 
>   (there are different levels of support depending on HW revision) 
> 
>   The XIVE interrupt controller has a set of internal sources for 
>   IPIs and CAPI-like interrupts.

Ok.  IIUC in hardware there's one of these in each PHB, plus maybe one
or two others.  Is that right?

> 
> * IVRE (No real model)
> 
>   in the middle, doing the routing of source event notification to
>   (cpu) targets. It relies on internal tables which are stored in 
>   the hypervisor/QEMU/KVM for the spapr machine and in the VM RAM 
>   for the powernv machine.

What does VM RAM mean in the powernv context?

>   Configuration updates of the XIVE tables are done through hcalls 
>   on spapr and with MMIOs on the IC regs on powernv. On the latter,
>   the changes are flushed back to the VM RAM. 
> 
> * IVPE (XiveNVT)
> 
>   set of registers for interrupt management at the CPU level. Exposed
>   in a specific MMIO region called the TIMA.

Ok.

> The XIVE tables are :
> 
> * IVT
> 
>   associates an interrupt source number with an event queue. The data
>   to be pushed to the queue is also stored there.

Ok, so there would be one of these tables for each IVRE, with one
entry for each source managed by that IVSE, yes?

Do the XIVE IPIs have entries here, or do they bypass this?

> * EQDT:
> 
>   describes the queues in the OS RAM, also contains a set of flags,
>   a virtual target, etc.

So on real hardware this would be global, yes?  And it would be
consulted by the IVRE?

For guests, we'd expect one table per-guest?  How would those be
integrated with the host table?

> * VPDT:
> 
>   describes the virtual targets, which can be of different kinds:
>   an lpar or a cpu. This is for powernv; spapr does not have this 
>   concept.

Ok.  On hardware that would also be global and consulted by the IVRE,
yes?

Under PAPR, I'm guessing the concept is missing because it essentially
has fixed contents: an entry for each vcpu and maybe one for the
lpar as a whole?
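
To check that I'm reading the table chain correctly, here is a rough
sketch in plain C.  Every name, field and size in it is made up by me
for illustration (nothing here comes from your patches or from the
hardware documentation), and the queue is reduced to a single slot.
The IVT is indexed directly by source number, the entry names an EQ,
and the EQ names a target:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names and layouts throughout; sizes are arbitrary. */
#define SK_NR_IRQS    8
#define SK_NR_EQS     4
#define SK_NR_TARGETS 2

typedef struct { bool valid; uint32_t eq_idx; uint32_t eq_data; } SkIVE;
typedef struct { bool valid; uint32_t target; uint32_t slot; bool full; } SkEQ;
typedef struct { bool notified; } SkTarget;

static SkIVE    sk_ivt[SK_NR_IRQS];     /* indexed directly by source number */
static SkEQ     sk_eqdt[SK_NR_EQS];     /* stand-in for the EQDT */
static SkTarget sk_vpdt[SK_NR_TARGETS]; /* stand-in for the VPDT */

/* Stand-in for pushing event data into a queue in OS RAM. */
static void sk_eq_push(SkEQ *eq, uint32_t data)
{
    assert(eq != NULL);
    eq->slot = data;
    eq->full = true;
}

/*
 * Walk IVT -> EQDT -> VPDT for one source number.
 * Returns the notified target, or NULL if the event is dropped
 * because some entry along the way is missing or invalid.
 */
static SkTarget *sk_route(uint32_t lisn)
{
    if (lisn >= SK_NR_IRQS || !sk_ivt[lisn].valid) {
        return NULL;
    }
    const SkIVE *ive = &sk_ivt[lisn];

    if (ive->eq_idx >= SK_NR_EQS || !sk_eqdt[ive->eq_idx].valid) {
        return NULL;
    }
    SkEQ *eq = &sk_eqdt[ive->eq_idx];

    if (eq->target >= SK_NR_TARGETS) {
        return NULL;
    }
    sk_eq_push(eq, ive->eq_data);
    sk_vpdt[eq->target].notified = true;
    return &sk_vpdt[eq->target];
}
```

If that is roughly the shape, then my questions above amount to asking
who owns each of the three arrays on hardware and under PAPR.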

> So, the idea behind the sPAPRXive object is to model a XIVE interrupt
> controller device. It contains today :

Yeah, exactly what a "XIVE interrupt controller device" is isn't really
clear to me.  If it's something that is necessarily global, I think
you'll be better off making it a machine interface rather than a
distinct object.

> 
>  - an internal source block for all interrupts : IPIs and virtual 
>    device interrupts. In the IRQ number space, the IPIs are below
>    4096 and the device interrupts above, which keeps compatibility 
>    with XICS. This is important to be able to change interrupt mode.
> 
>    PowerNV has different source blocks, like for P8.
> 
>  - a routing engine, which is limited to the IVT. This is a shortcut 
>    and it might be better to introduce a specific object. Anyhow, this 
>    is a state to capture.

Ok.  It sounds like this is roughly the equivalent of the XICSFabric,
and likewise would probably be better handled by an interface on the
machine rather than a distinct object.  But I'm not clear enough to be
certain of that yet.

>    In the current version I am working on, the XiveFabric interface is
>    more complex :
> 
>       typedef struct XiveFabricClass {
>           InterfaceClass parent;
>           XiveIVE *(*get_ive)(XiveFabric *xf, uint32_t lisn);

This does an IVT lookup, I take it?

>           XiveNVT *(*get_nvt)(XiveFabric *xf, uint32_t server);

This one a VPDT lookup, yes?

>           XiveEQ  *(*get_eq)(XiveFabric *xf, uint32_t eq_idx);

And this one an EQDT lookup?

>       } XiveFabricClass;
> 
>    It helps in making the routing algorithm independent of the model. 
>    I hope to make powernv converge and use it.
> 
>  - a set of MMIOs for the TIMA. They model the presenter engine. 
>    current_cpu is used to retrieve the NVT object, which holds the 
>    registers for interrupt management.  

Right.  Now the TIMA is local to a target/server, not an EQ, right?

I guess we need at least one of these per-vcpu.  Do we also need an
lpar-global, or other special ones?
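
For what it's worth, this is how I picture the "current_cpu selects
the NVT" behaviour you describe: one MMIO window, with each access
resolved against the registers of whichever CPU issued it.  A minimal
sketch, assuming a byte-wide register file; the names, the sizes and
the sketch_current_cpu variable (a stand-in for QEMU's current_cpu)
are all my own invention:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sizes, chosen only to keep the sketch small. */
#define SKETCH_MAX_CPUS  4
#define SKETCH_TIMA_SIZE 16

/* One register file per CPU; the real register layout is not modelled. */
typedef struct { uint8_t regs[SKETCH_TIMA_SIZE]; } SketchNVT;

static SketchNVT sketch_nvts[SKETCH_MAX_CPUS];

/* Stand-in for "which CPU is performing this MMIO access". */
static unsigned sketch_current_cpu;

/* Read from the shared window: the accessing CPU picks the register file. */
static uint8_t sketch_tima_read(uint32_t offset)
{
    assert(sketch_current_cpu < SKETCH_MAX_CPUS);
    if (offset >= SKETCH_TIMA_SIZE) {
        return 0xff; /* out-of-range reads return all ones in this sketch */
    }
    return sketch_nvts[sketch_current_cpu].regs[offset];
}

/* Write to the shared window: same address, per-CPU backing store. */
static void sketch_tima_write(uint32_t offset, uint8_t val)
{
    assert(sketch_current_cpu < SKETCH_MAX_CPUS);
    if (offset < SKETCH_TIMA_SIZE) {
        sketch_nvts[sketch_current_cpu].regs[offset] = val;
    }
}
```

In this picture there is exactly one register file per vcpu and
nothing else, which is why I'm asking whether any lpar-wide or other
special ones exist.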

> The EQs are stored under the NVT. This saves us an unnecessary EQDT 
> table. But we could add one under the XIVE device model.

I'm not sure of the distinction you're drawing between the NVT and the
XIVE device model.
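
If the difference is only where a get_eq()-style lookup finds its
storage, then I'd picture it something like the sketch below.  This is
guesswork on my part: the types, the two backends and in particular
the eq_idx = server * priorities + priority encoding are assumptions I
made up for illustration, not anything from your series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sizes and types, for illustration only. */
#define FAB_NR_SERVERS 2
#define FAB_NR_PRIOS   8

typedef struct { uint32_t qaddr; } FabEQ;
typedef struct { FabEQ eqs[FAB_NR_PRIOS]; } FabNVT; /* EQs under the NVT */

typedef struct FabFabric FabFabric;
struct FabFabric {
    /* The routing code only ever calls this accessor. */
    FabEQ *(*get_eq)(FabFabric *xf, uint32_t eq_idx);
    FabNVT nvts[FAB_NR_SERVERS];                 /* storage for backend (a) */
    FabEQ  eqdt[FAB_NR_SERVERS * FAB_NR_PRIOS];  /* storage for backend (b) */
};

/* (a) No EQDT: decode eq_idx into (server, priority), an assumed encoding. */
static FabEQ *fab_get_eq_under_nvt(FabFabric *xf, uint32_t eq_idx)
{
    assert(xf != NULL);
    uint32_t server = eq_idx / FAB_NR_PRIOS;
    uint32_t prio = eq_idx % FAB_NR_PRIOS;

    if (server >= FAB_NR_SERVERS) {
        return NULL;
    }
    return &xf->nvts[server].eqs[prio];
}

/* (b) A flat EQDT owned by the controller model, indexed by eq_idx. */
static FabEQ *fab_get_eq_from_eqdt(FabFabric *xf, uint32_t eq_idx)
{
    assert(xf != NULL);
    if (eq_idx >= FAB_NR_SERVERS * FAB_NR_PRIOS) {
        return NULL;
    }
    return &xf->eqdt[eq_idx];
}
```

Either backend satisfies the same caller, so the question is really
which object owns the state and gets migrated, not how the routing
code looks.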

[snip]

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson