qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [RFC PATCH] spapr/xive: Allocate vCPU IPIs from local context


From: Greg Kurz
Subject: Re: [RFC PATCH] spapr/xive: Allocate vCPU IPIs from local context
Date: Fri, 24 Sep 2021 15:49:06 +0200

On Fri, 24 Sep 2021 14:40:24 +0200
Cédric Le Goater <clg@kaod.org> wrote:

> On 9/23/21 11:12, Greg Kurz wrote:
> > On Wed, 22 Sep 2021 16:41:20 +0200
> > Cédric Le Goater <clg@kaod.org> wrote:
> > 
> >> When QEMU switches to the XIVE interrupt mode, it creates all possible
> >> guest interrupts at the level of the KVM device. These interrupts are
> >> backed by real HW interrupts from the IPI interrupt pool of the XIVE
> >> controller.
> >>
> >> Currently, this is done from the QEMU main thread, which results in
> >> allocating all interrupts from the chip on which QEMU is running. IPIs
> >> are not distributed across the system and the load is not well
> >> balanced across the interrupt controllers.
> >>
> >> To improve distribution on the system, we should try to allocate the
> >> underlying HW IPI from the vCPU context. This also benefits to
> >> configurations using CPU pinning. In this case, the HW IPI is
> >> allocated on the local chip, rerouting between interrupt controllers
> >> is reduced and performance improved.
> >>
> >> This moves the initialization of the vCPU IPIs from reset time to the
> >> H_INT_SET_SOURCE_CONFIG hcall which is called from the vCPU context.
> >> But this needs some extra checks in the sequences getting and setting
> >> the source states to make sure a valid HW IPI is backing the guest
> >> interrupt. For that, we check if a target was configured in the END in
> >> case of a vCPU IPI.
> >>
> >> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> >> ---
> >>
> >>   I have tested different SMT configurations, kernel_irqchip=off/on,
> >>   did some migrations, CPU hotplug, etc. It's not enough and I would
> >>   like more testing but, at least, it is not making anymore the bad
> >>   assumption vCPU id = IPI number.
> >>
> > 
> > Yeah, the IPI number is provided by the guest, so h_int_set_source_config()
> > is really the only place where we can know the IPI number of a given vCPU.
> 
> The patch lacks a run_on_cpu() to perform the reset on the vCPU context
> to be complete.
> 

Yes since the vCPU doing the hcall is obviously not the target for the
IPI :-)

> > 
> >>   Comments ?
> >>
> > 
> > LGTM but I didn't check if more users of xive_end_is_valid() should
> > be converted to using xive_source_is_initialized().
> 
> I think you mean xive_eas_is_valid() ?
> 

Oops yes... bad copy/paste :-\

> The changes only impact KVM support since we are deferring the IRQ
> initialization at the KVM level. What we have to be careful about is not
> accessing an ESB page of an interrupt that would not have been initiliazed
> in the KVM device, else QEMU gets a sigbus.
> 

Ok, so this is just needed on a path that leads to xive_esb_rw() or
kvmppc_xive_esb_trigger(), right ?

It seems that

h_int_esb()
 kvmppc_xive_esb_rw()

can get there with an lisn provided by the guest, and I don't see any
such check on the way : h_int_esb() only checks xive_eas_is_valid().

Cheers,

--
Greg

> That only happens when QEMU gets/sets the ESB states.
>   
> > Any chance you have some perf numbers to share ?
> 
> I will try.
> 
> Thanks,
> 
> C.
> 
>   
> >>   hw/intc/spapr_xive.c     | 17 +++++++++++++++++
> >>   hw/intc/spapr_xive_kvm.c | 36 +++++++++++++++++++++++++++++++-----
> >>   2 files changed, 48 insertions(+), 5 deletions(-)
> >>
> >> diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
> >> index 6f31ce74f198..2dc594a720b1 100644
> >> --- a/hw/intc/spapr_xive.c
> >> +++ b/hw/intc/spapr_xive.c
> >> @@ -1089,6 +1089,23 @@ static target_ulong 
> >> h_int_set_source_config(PowerPCCPU *cpu,
> >>       if (spapr_xive_in_kernel(xive)) {
> >>           Error *local_err = NULL;
> >>   
> >> +        /*
> >> +         * Initialize the vCPU IPIs from the vCPU context to allocate
> >> +         * the backing HW IPI on the local chip. This improves
> >> +         * distribution of the IPIs in the system and when the vCPUs
> >> +         * are pinned, it reduces rerouting between interrupt
> >> +         * controllers for better performance.
> >> +         */
> >> +        if (lisn < SPAPR_XIRQ_BASE) {
> >> +            XiveSource *xsrc = &xive->source;
> >> +
> >> +            kvmppc_xive_source_reset_one(xsrc, lisn, &local_err);
> >> +            if (local_err) {
> >> +                error_report_err(local_err);
> >> +                return H_HARDWARE;
> >> +            }
> >> +        }
> >> +
> >>           kvmppc_xive_set_source_config(xive, lisn, &new_eas, &local_err);
> >>           if (local_err) {
> >>               error_report_err(local_err);
> >> diff --git a/hw/intc/spapr_xive_kvm.c b/hw/intc/spapr_xive_kvm.c
> >> index 53731d158625..1607a59e9483 100644
> >> --- a/hw/intc/spapr_xive_kvm.c
> >> +++ b/hw/intc/spapr_xive_kvm.c
> >> @@ -254,7 +254,12 @@ static int kvmppc_xive_source_reset(XiveSource *xsrc, 
> >> Error **errp)
> >>       SpaprXive *xive = SPAPR_XIVE(xsrc->xive);
> >>       int i;
> >>   
> >> -    for (i = 0; i < xsrc->nr_irqs; i++) {
> >> +    /*
> >> +     * vCPU IPIs are initialized at the KVM level when configured by
> >> +     * H_INT_SET_SOURCE_CONFIG.
> >> +     */
> >> +
> >> +    for (i = SPAPR_XIRQ_BASE; i < xsrc->nr_irqs; i++) {
> >>           int ret;
> >>   
> >>           if (!xive_eas_is_valid(&xive->eat[i])) {
> >> @@ -342,6 +347,27 @@ uint64_t kvmppc_xive_esb_rw(XiveSource *xsrc, int 
> >> srcno, uint32_t offset,
> >>       }
> >>   }
> >>   
> >> +static bool xive_source_is_initialized(SpaprXive *xive, int lisn)
> >> +{
> >> +    assert(lisn < xive->nr_irqs);
> >> +
> >> +    if (!xive_eas_is_valid(&xive->eat[lisn])) {
> >> +        return false;
> >> +    }
> >> +
> >> +    /*
> >> +     * vCPU IPIs are initialized at the KVM level when configured by
> >> +     * H_INT_SET_SOURCE_CONFIG, in which case, we should have a valid
> >> +     * target (server, priority) in the END.
> >> +     */
> >> +    if (lisn < SPAPR_XIRQ_BASE) {
> >> +        return !!xive_get_field64(EAS_END_INDEX, xive->eat[lisn].w);
> >> +    }
> >> +
> >> +    /* Device sources */
> >> +    return true;
> >> +}
> >> +
> >>   static void kvmppc_xive_source_get_state(XiveSource *xsrc)
> >>   {
> >>       SpaprXive *xive = SPAPR_XIVE(xsrc->xive);
> >> @@ -350,7 +376,7 @@ static void kvmppc_xive_source_get_state(XiveSource 
> >> *xsrc)
> >>       for (i = 0; i < xsrc->nr_irqs; i++) {
> >>           uint8_t pq;
> >>   
> >> -        if (!xive_eas_is_valid(&xive->eat[i])) {
> >> +        if (!xive_source_is_initialized(xive, i)) {
> >>               continue;
> >>           }
> >>   
> >> @@ -533,7 +559,7 @@ static void kvmppc_xive_change_state_handler(void 
> >> *opaque, bool running,
> >>               uint8_t pq;
> >>               uint8_t old_pq;
> >>   
> >> -            if (!xive_eas_is_valid(&xive->eat[i])) {
> >> +            if (!xive_source_is_initialized(xive, i)) {
> >>                   continue;
> >>               }
> >>   
> >> @@ -561,7 +587,7 @@ static void kvmppc_xive_change_state_handler(void 
> >> *opaque, bool running,
> >>       for (i = 0; i < xsrc->nr_irqs; i++) {
> >>           uint8_t pq;
> >>   
> >> -        if (!xive_eas_is_valid(&xive->eat[i])) {
> >> +        if (!xive_source_is_initialized(xive, i)) {
> >>               continue;
> >>           }
> >>   
> >> @@ -666,7 +692,7 @@ int kvmppc_xive_post_load(SpaprXive *xive, int 
> >> version_id)
> >>   
> >>       /* Restore the EAT */
> >>       for (i = 0; i < xive->nr_irqs; i++) {
> >> -        if (!xive_eas_is_valid(&xive->eat[i])) {
> >> +        if (!xive_source_is_initialized(xive, i)) {
> >>               continue;
> >>           }
> >>   
> > 
> 




reply via email to

[Prev in Thread] Current Thread [Next in Thread]