[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH for-6.0 v2 2/3] spapr/xive: Fix size of END table and number

From: Greg Kurz
Subject: Re: [PATCH for-6.0 v2 2/3] spapr/xive: Fix size of END table and number of claimed IPIs
Date: Thu, 3 Dec 2020 11:50:30 +0100

On Thu, 3 Dec 2020 10:51:10 +0100
Cédric Le Goater <clg@kaod.org> wrote:

> On 11/30/20 7:07 PM, Cédric Le Goater wrote:
> > On 11/30/20 5:52 PM, Greg Kurz wrote:
> >> The sPAPR XIVE device has an internal ENDT table the size of
> >> which is configurable by the machine. This table is supposed
> >> to contain END structures for all possible vCPUs that may
> >> enter the guest. The machine must also claim IPIs for all
> >> possible vCPUs since this is expected by the guest.
> >>
> >> spapr_irq_init() takes care of that under the assumption that
> >> spapr_max_vcpu_ids() returns the number of possible vCPUs.
> >> This happens to be the case when the VSMT mode is set to match
> >> the number of threads per core in the guest (default behavior).
> >> With non-default VSMT settings, this limit is > to the number
> >> of vCPUs. In the worst case, we can end up allocating an 8 times
> >> bigger ENDT and claiming 8 times more IPIs than needed. But more
> >> importantly, this creates a confusion between number of vCPUs and
> >> vCPU ids, which can lead to subtle bugs like [1].
> >>
> >> Use smp.max_cpus instead of spapr_max_vcpu_ids() in
> >> spapr_irq_init() for the latest machine type. Older machine
> >> types continue to use spapr_max_vcpu_ids() since the size of
> >> the ENDT is migration visible.
> >>
> >> [1] https://bugs.launchpad.net/qemu/+bug/1900241
> >>
> >> Signed-off-by: Greg Kurz <groug@kaod.org>
> > 
> > 
> > Reviewed-by: Cédric Le Goater <clg@kaod.org>
> I gave patch 2 and 3 a little more thinking. 
> I don't think we need much more than patch 1 which clarifies the 
> nature of the values being manipulated, quantities vs. numbering.
> The last 2 patches are adding complexity to try to optimize the 
> XIVE VP space in a case scenario which is not very common (vSMT). 
> May be it's not worth it. 

Well, the motivation isn't about optimization really since
a non-default vSMT setting already wastes VP space because
of the vCPU spacing. This is more about not using values
with wrong semantics in the code to avoid confusion in
future changes.

I agree though that the extra complexity, especially the
compat crust, might be excessive. So maybe I should just
add comments in the code to clarify when we're using the
wrong semantics ?

> Today, we can start 4K (-2) KVM guests with 16 vCPUs each on 
> a witherspoon (2 socket P9) and we are far from reaching the 
> limits of the VP space. Available RAM is more a problem. 
> VP space is even bigger on P10. The width was increased to 24bit 
> per chip.
> C.

reply via email to

[Prev in Thread] Current Thread [Next in Thread]