Re: [Qemu-devel] [PATCH v3 04/35] spapr/xive: introduce a XIVE interrupt controller for sPAPR


From: David Gibson
Subject: Re: [Qemu-devel] [PATCH v3 04/35] spapr/xive: introduce a XIVE interrupt controller for sPAPR
Date: Sat, 5 May 2018 14:26:04 +1000
User-agent: Mutt/1.9.3 (2018-01-21)

On Fri, May 04, 2018 at 03:05:08PM +0200, Cédric Le Goater wrote:
> On 05/04/2018 05:33 AM, David Gibson wrote:
> > On Thu, May 03, 2018 at 06:50:09PM +0200, Cédric Le Goater wrote:
> >> On 05/03/2018 07:22 AM, David Gibson wrote:
> >>> On Thu, Apr 26, 2018 at 12:43:29PM +0200, Cédric Le Goater wrote:
> >>>> On 04/26/2018 06:20 AM, David Gibson wrote:
> >>>>> On Tue, Apr 24, 2018 at 11:46:04AM +0200, Cédric Le Goater wrote:
> >>>>>> On 04/24/2018 08:51 AM, David Gibson wrote:
> >>>>>>> On Thu, Apr 19, 2018 at 02:43:00PM +0200, Cédric Le Goater wrote:
> >>>>>>>> sPAPRXive is a model for the XIVE interrupt controller device of the
> >>>>>>>> sPAPR machine. It holds the XIVE routing table, the Interrupt
> >>>>>>>> Virtualization Entry (IVE) table, which associates interrupt source
> >>>>>>>> numbers with targets.
> >>>>>>>>
> >>>>>>>> Also extend the XiveFabric with an accessor to the IVT. This will be
> >>>>>>>> needed by the routing algorithm.
> >>>>>>>>
> >>>>>>>> Signed-off-by: Cédric Le Goater <address@hidden>
> >>>>>>>> ---
> >>>>>>>>
> >>>>>>>>  Maybe we should introduce a XiveRouter model to hold the IVT. To be
> >>>>>>>>  discussed.
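
Just to have something concrete to point at while we discuss where the IVT
should live, here is the rough shape I picture for the sPAPR model -- a
minimal sketch; every name below is invented rather than taken from the
patch:

/* Minimal sketch only -- invented names, not the patch's definitions. */
#include <stdint.h>
#include <stdbool.h>

/* One IVT entry: associates an interrupt source number with a target. */
typedef struct XiveIVE {
    bool     valid;
    bool     masked;
    uint32_t eq_index;   /* event queue (EQ) the source is routed to  */
    uint32_t eq_data;    /* data word pushed into the OS event queue  */
} XiveIVE;

/* sPAPR XIVE controller: in the sPAPR case the whole IVT is QEMU state. */
typedef struct sPAPRXive {
    uint32_t nr_irqs;
    XiveIVE *ivt;        /* indexed by interrupt source number        */
} sPAPRXive;

/* XiveFabric-style accessor to the IVT, as mentioned in the log above. */
static inline XiveIVE *spapr_xive_get_ive(sPAPRXive *xive, uint32_t srcno)
{
    return srcno < xive->nr_irqs ? &xive->ivt[srcno] : NULL;
}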
> >>>>>>>
> >>>>>>> Yeah, maybe.  Am I correct in thinking that on pnv there could be more
> >>>>>>> than one XiveRouter?
> >>>>>>
> >>>>>> There is only one, the main IC. 
> >>>>>
> >>>>> Ok, that's what I thought originally.  In that case some of the stuff
> >>>>> in the patches really doesn't make sense to me.
> >>>>
> >>>> Well, there is one IC per chip on powernv, but we haven't reached that
> >>>> part yet.
> >>>
> >>> Hmm.  There's some things we can delay dealing with, but I don't think
> >>> this is one of them.  I think we need to understand how multichip is
> >>> going to work in order to come up with a sane architecture.  Otherwise
> >>> I fear we'll end up with something that we either need to horribly
> >>> bastardize for multichip, or have to rework things dramatically
> >>> leading to migration nightmares.
> >>
> >> It is all controlled by MMIO, so we should be fine on that part.
> >> As for the internal tables, they are all configured by firmware, using
> >> a chip identifier (block). I need to check how the remote XIVEs are
> >> accessed. I think this is by MMIO.
> > 
> > Right, but for powernv we execute OPAL inside the VM, rather than
> > emulating its effects.  So we still need to model the actual hardware
> > interfaces.  OPAL hides the details from the kernel, but not from us
> > on the other side.
> 
> Yes. This is the case in the current model. I took a look today and
> I have a few fixes for the MMIO layout for P9 chips which I will send.
> 
> As for XIVE, the model needs to be a little more complex to support
> VSD_MODE_FORWARD tables, which describe how to forward a notification
> to another XIVE IC on another chip. They contain an address to load
> from; this is another hop in the notification chain.

Ah, ok.  So is that mode and address configured in the (bare metal)
IVT as well?  Or is that a different piece of configuration?
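
Regardless, here is roughly how I picture that forward hop, as a standalone
sketch based only on your description above -- the names, the VSD layout and
the helper are all made up for illustration:

/* Sketch of the forward hop described above: if the VSD for a block is a
 * forward entry, the notification is not routed locally but re-emitted as
 * a load at the address held in the descriptor, which reaches the XIVE IC
 * of the chip owning that block.  Invented names and layout. */
#include <stdint.h>
#include <stdbool.h>

enum {
    VSD_MODE_EXCLUSIVE = 0,   /* table handled by this IC             */
    VSD_MODE_FORWARD   = 1,   /* table lives behind another chip's IC */
};

typedef struct XiveVSD {
    uint8_t  mode;
    uint64_t address;         /* forward case: MMIO address to load from */
} XiveVSD;

/* Hypothetical stand-in for the bus access the real model would use. */
extern uint64_t mmio_load64(uint64_t addr);

/* Returns true if the notification was forwarded to a remote IC. */
static inline bool xive_maybe_forward(const XiveVSD *vsd)
{
    if (vsd->mode != VSD_MODE_FORWARD) {
        return false;                 /* route locally                   */
    }
    (void)mmio_load64(vsd->address);  /* the load itself is the next hop */
    return true;
}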

> >> I haven't looked at multichip XIVE support but I am not too worried as 
> >> the framework is already in place for the machine.
> >>  
> >>>>>>> If we did have a XiveRouter, I'm not sure we'd need the XiveFabric
> >>>>>>> interface, possibly its methods could just be class methods of
> >>>>>>> XiveRouter.
> >>>>>>
> >>>>>> Yes. We could introduce a XiveRouter to share the IVT between
> >>>>>> the sPAPRXive and the PnvXIVE models, the interrupt controllers of
> >>>>>> the machines. Methods would provide a way to get the IVE/EQ/NVT
> >>>>>> objects required for routing. I need to add a set_eq() to push the
> >>>>>> EQ data.
> >>>>>
> >>>>> Hrm.  Well, to add some more clarity, let's say the XiveRouter is the
> >>>>> object which owns the IVT.  
> >>>>
> >>>> OK, that would be a model with some state and not an interface.
> >>>
> >>> Yes.  For the papr variant it would have the whole IVT contents as its
> >>> state.  For the powernv variant, just the registers telling it where to
> >>> find the IVT in RAM.
> >>>
> >>>>> It may or may not do other stuff as well.
> >>>>
> >>>> Its only task would be to do the final event routing: get the IVE,
> >>>> get the EQ, push the EQ DATA into the OS event queue, notify the CPU.
> >>>
> >>> That seems like a lot of steps.  Up to pushing the EQ DATA, certainly.
> >>> And I guess it'll have to ping an NVT somehow, but I'm not sure it
> >>> should know about CPUs as such.
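
To put the step-by-step part in front of us, here is a standalone sketch of
the routing path with the boundary I would draw: the router resolves the IVE
and the EQ and pushes the EQ data, then only pings an NVT through a hook.
All names are invented, heavily simplified:

/* Standalone, simplified sketch of the routing steps (invented names):
 * resolve the IVE, find the EQ, push the EQ data into the OS event queue,
 * then hand off to an opaque NVT notifier instead of touching CPUs. */
#include <stdint.h>
#include <stddef.h>

typedef struct IVE { uint32_t eq_index; uint32_t eq_data; } IVE;

typedef struct EQ {
    uint32_t *ring;        /* OS event queue (guest RAM in the real thing) */
    uint32_t  entries;
    uint32_t  prod;        /* producer index; generation bit omitted       */
    uint32_t  nvt_index;   /* target to notify once the data is queued     */
} EQ;

typedef struct Router {
    IVE   *ivt;   size_t nr_ives;
    EQ    *eqdt;  size_t nr_eqs;
    /* The router only "pings" an NVT; what that means (raise a CPU line,
     * match a thread context, ...) is the backend's business.            */
    void (*notify_nvt)(struct Router *r, uint32_t nvt_index);
} Router;

static void router_notify(Router *r, uint32_t srcno)
{
    if (srcno >= r->nr_ives) {
        return;
    }
    IVE *ive = &r->ivt[srcno];
    if (ive->eq_index >= r->nr_eqs) {
        return;
    }
    EQ *eq = &r->eqdt[ive->eq_index];

    /* Push the EQ DATA into the OS event queue. */
    eq->ring[eq->prod % eq->entries] = ive->eq_data;
    eq->prod++;

    /* Final step: notify whoever backs the NVT. */
    r->notify_nvt(r, eq->nvt_index);
}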
> >>
> >> For PowerNV, the concept could be generalized, yes. An NVT can
> >> contain the interrupt state of a logical server, but the common
> >> case is bare metal without guests for QEMU, and so we have an NVT
> >> per CPU.
> > 
> > Hmm.  We eventually want to support a kernel running guests under
> > qemu/powernv though, right?  
> 
> Argh, an emulated hypervisor! OK, let's say this is a long-term goal :) 
> 
> > So even if we don't allow it right now,
> > we don't want allowing that to require major surgery to our
> > architecture.
> 
> That I agree on. 
> 
> >> PowerNV will have some limitations, but we can make it better than
> >> today for sure. It boots.
> >>
> >> We can improve parts of the NVT notification process, the way NVTs
> >> are matched, and eventually maybe support remote engines if the
> >> NVT is not local. I have not looked at the details.
> >>
> >>> I'm not sure at this stage what should own the EQD table.
> >>
> >> The EQDT is in RAM.
> > 
> > Not for spapr, it's not.  
> 
> Yeah, OK. It's in QEMU/KVM.
> 
> > And even when it is in RAM, something needs
> > to own the register that gives its base address.
> 
> It's more complex than registers on powernv. There is a procedure
> to define the XIVE tables using XIVE table descriptors, which contain
> their characteristics: size, direct vs. indirect, local vs. remote.
> OPAL/skiboot defines all of these to configure the HW, and the model
> necessarily needs to support the same interface. This is the case
> for a single chip.

Ah, ok.  So there's some sort of IVTD.  Also in RAM?  Eventually there
must be a register giving the base address of the IVTD, yes?
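
In any case, to make sure we mean the same thing by "XiveRouter", here is
the rough split I have in mind, in a QOM-ish but purely illustrative form:
the sPAPR variant owns the whole IVT as device state, while the powernv
variant only owns the descriptor pointing at the table in RAM.  Every name
below is invented:

/* Illustrative only -- invented names, not an actual QEMU interface.    */
#include <stdint.h>
#include <stdbool.h>

typedef struct XiveIVE { uint64_t w; } XiveIVE;

/* What the router needs from its subclasses: a way to fetch an IVE by
 * global source number.  The routing logic itself lives in the base.    */
typedef struct XiveRouterOps {
    bool (*get_ive)(void *router, uint32_t srcno, XiveIVE *ive);
} XiveRouterOps;

/* sPAPR: the whole table is QEMU state (and is what gets migrated).     */
typedef struct sPAPRXiveRouter {
    XiveRouterOps ops;
    XiveIVE      *ivt;
    uint32_t      nr_irqs;
} sPAPRXiveRouter;

/* PowerNV: only the table descriptor programmed by OPAL is state; the
 * entries themselves are read from guest RAM (direct or indirect table)
 * on demand when routing.                                               */
typedef struct PnvXiveRouter {
    XiveRouterOps ops;
    uint64_t      ivt_vsd;   /* "where is the IVT" descriptor/register   */
} PnvXiveRouter;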

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
