qemu-ppc

From: David Gibson
Subject: Re: [Qemu-ppc] [RFC PATCH v2 00/21] Guest exploitation of the XIVE interrupt controller (POWER9)
Date: Thu, 28 Sep 2017 23:17:44 +1000
User-agent: Mutt/1.9.0 (2017-09-02)

On Thu, Sep 28, 2017 at 10:23:22AM +0200, Benjamin Herrenschmidt wrote:
> On Wed, 2017-09-20 at 14:33 +0200, Cédric Le Goater wrote:
> > > > I'm thinking maybe trying to support the CAS negotiation of interrupt
> > > > controller from day 1 is warping the design.  A better approach might
> > > > be first to implement XIVE only when given a specific machine option -
> > > > guest gets one or the other and can't negotiate.
> > 
> > ok. 
> > 
> > CAS is not the most complex problem, we mostly need to share 
> > the ICSIRQState array and the source offset. Migration from older
> > machines is a problem. We are doomed to keep the existing XICS
> > framework available.
> 
> I don't like sharing anything. I'd rather we had separate objects
> altogether. If needed we can implement CAS by doing a partition reboot
> like pHyp does, at least initially, until we add ways to tear down and
> rebuild objects.

Right, I agree.  The difficulty isn't really CAS reboot or not, it's
more that altering the virtual hardware at runtime is... awkward... in
qemu.  And then there's the issue of migrating the state, which also
gets a bit complex.

As you've seen elsewhere, I think we need to get the XIVE model right
on its own first, then worry about those issues.

> The main issue is whether we can keep a consistent number space so the
> DT doesn't have to be completely rebuilt. If it does, then reboot will
> be the only practical option I'm afraid.

I think it should be possible to make a consistent number space.  At
present the irq allocation is kind of tied to xics, but I think that's
fixable.
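
To make that a bit more concrete, here is a rough sketch in C of the
shape I have in mind.  It is not qemu code: IrqNumberSpace, IrqBackend,
NR_IRQS and all the function names are made up for illustration, and
the per-source state is just a byte array standing in for whatever
XICS or XIVE would really need.  The only point is that the allocator
knows nothing about the controller, so the controller state can be
thrown away and rebuilt (at CAS time, say) without any irq number
changing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define NR_IRQS 64  /* hypothetical size of the guest irq number space */

typedef enum { BACKEND_XICS, BACKEND_XIVE } IrqBackendKind;

/* Number space: owned by the machine, independent of XICS or XIVE. */
typedef struct {
    bool claimed[NR_IRQS];
    bool lsi[NR_IRQS];      /* level-sensitive or message-signalled */
} IrqNumberSpace;

/* Per-controller state tracking; hypothetical stand-in only. */
typedef struct {
    IrqBackendKind kind;
    unsigned char state[NR_IRQS];
} IrqBackend;

/* Claim the lowest free irq number, or return -1 if none is left. */
int irq_space_alloc(IrqNumberSpace *space, bool lsi)
{
    assert(space != NULL);
    for (int i = 0; i < NR_IRQS; i++) {
        if (!space->claimed[i]) {
            space->claimed[i] = true;
            space->lsi[i] = lsi;
            return i;
        }
    }
    return -1;
}

/* Create zeroed backend state of the given kind (NULL on failure). */
IrqBackend *irq_backend_new(IrqBackendKind kind)
{
    IrqBackend *be = calloc(1, sizeof(*be));
    if (be) {
        be->kind = kind;
    }
    return be;
}

void irq_backend_free(IrqBackend *be)
{
    free(be);
}

/*
 * Tear down the old state tracking object and build a fresh one.
 * The number space is deliberately not an argument: nothing here
 * can renumber a source.
 */
IrqBackend *irq_backend_switch(IrqBackend *old, IrqBackendKind kind)
{
    irq_backend_free(old);
    return irq_backend_new(kind);
}
```

In this toy model anything that only refers to irq numbers (the DT,
for instance) stays valid across irq_backend_switch().  Migrating the
backend state is a separate problem that the sketch doesn't attempt to
address.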

> > > > That should allow a more natural XIVE design to emerge, *then* we can
> > > > look at what's necessary to make boot-time negotiation possible.
> > > 
> > > Actually, it just occurred to me that we might be making life hard for
> > > ourselves by trying to actually switch between full XICS and XIVE
> > > models.  Couldn't we have new machine types always construct the XIVE
> > > infrastructure, 
> > 
> > yes.
> > 
> > > but then implement the XICS RTAS and hcalls in terms of the XIVE virtual 
> > > hardware.
> 
> That's gross :-)
> 
> This is also exactly what KVM does with real XIVE HW and there's also
> such an emulation in OPAL. I'd be wary of creating a 3rd one...
> 
> I'd much prefer if we managed to:
> 
>  - Split the source numbering from the various state tracking objects
> so we can have that common
> 
>  - Either delay the creation to after CAS or tear down & re-create the
> state tracking objects at CAS time.
> 
> > ok but migration will not be supported.
> > 
> > > Since something more or less equivalent
> > > has already been done in both OPAL and the host kernel, I'm guessing
> > > this shouldn't be too hard at this point.
> 
> It would very much suck to have yet another one of these.

Hm, ok.

> Also we need to understand how that would work in a KVM context, the
> kernel will provide a "XICS" state even on top of XIVE unless we switch
> the kernel object to native, but then the kernel will expect full
> exploitation.
> 
> > Indeed that is how it is working currently on P9 kvm guests. hcalls are
> > implemented on top of XIVE native.
> > 
> > Thanks,
> > 
> > 
> > C.
> 

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
