Re: [PATCH qemu] spapr_pci: Disable IRQFD resampling on XIVE

From: Cédric Le Goater
Subject: Re: [PATCH qemu] spapr_pci: Disable IRQFD resampling on XIVE
Date: Wed, 27 Apr 2022 09:36:41 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.7.0

Hello Alexey,

On 4/27/22 06:36, Alexey Kardashevskiy wrote:
VFIO-PCI has a "KVM_IRQFD_FLAG_RESAMPLE" optimization for INTx EOI
handling, where KVM can unmask PCI INTx (a level-triggered interrupt)
without switching to userspace (i.e. QEMU).

Unfortunately XIVE does not support level interrupts,

That's not correctly phrased, I think.

The QEMU XIVE device supports LSIs, but the POWER9 kernel-irqchips,
the KVM XICS-on-XIVE and XIVE native devices, are broken with respect
to passthrough adapters using INTx.

QEMU emulates them
and therefore there is no existing code path to kick the resamplefd.
The problem appears when passing through a PCI adapter with
the "pci=nomsi" kernel parameter: the adapter's interrupt
count in /proc/interrupts will be stuck at "1".

This disables the resampler when the XIVE interrupt controller is configured.
This should not be very visible though, as KVM already exits to QEMU for INTx
and XIVE-capable boxes (POWER9 and newer) do not seem to have
performance-critical devices that are only capable of INTx.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>

Cédric, this is what I meant when I said that spapr_pci.c was unaware of
the interrupt controller type: neither xics nor xive was mentioned
in the file before.

  hw/ppc/spapr_pci.c | 14 +++++++++++---
  1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index 5bfd4aa9e5aa..2675052601db 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -729,11 +729,19 @@ static void pci_spapr_set_irq(void *opaque, int irq_num, int level)
static PCIINTxRoute spapr_route_intx_pin_to_irq(void *opaque, int pin)
+    SpaprMachineState *spapr = SPAPR_MACHINE(qdev_get_machine());
      SpaprPhbState *sphb = SPAPR_PCI_HOST_BRIDGE(opaque);
-    PCIINTxRoute route;
+    PCIINTxRoute route = { .mode = PCI_INTX_DISABLED };
-    route.mode = PCI_INTX_ENABLED;
-    route.irq = sphb->lsi_table[pin].irq;
+    /*
+     * Disable IRQFD resampler on XIVE as it does not support LSI and QEMU
+     * emulates those so the KVM kernel resamplefd kick is skipped and EOI
+     * is not delivered to VFIO-PCI.
+     */
+    if (!spapr->xive) {

This tests the availability of the XIVE interrupt mode, not which
controller is active. See spapr_irq_init(), which is called very
early in the machine initialization.

Is that what we want? Is everything fine if we start the machine with
ic-mode=xics? On a POWER9 host, this would use the KVM XICS-on-XIVE
device, which is also broken AFAICT.

You should extend the SpaprInterruptControllerClass (for a routine) or
simply SpaprIrq (for a bool) if you need to handle IRQ matters from a
device model.
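
To illustrate the "bool" variant of that suggestion, here is a minimal
standalone sketch. It is not QEMU code: IrqBackend, Machine, IntxRoute,
the intx_resample_ok flag and route_intx() are all hypothetical stand-ins
that only model the idea of asking the active controller backend, rather
than testing whether XIVE is available.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the real QEMU definitions. */
typedef enum { INTX_DISABLED, INTX_ENABLED } IntxMode;

typedef struct {
    IntxMode mode;
    int irq;
} IntxRoute;

/* Models one interrupt controller backend (emulated or in-kernel). */
typedef struct {
    const char *name;
    bool intx_resample_ok;    /* assumed flag: resamplefd kick works */
} IrqBackend;

typedef struct {
    bool xive_available;      /* capability only, like testing spapr->xive */
    const IrqBackend *active; /* backend actually in use right now */
} Machine;

/*
 * Route an INTx pin: enable the route only if the *active* backend
 * reports that resampling works, whatever modes are merely available.
 */
static IntxRoute route_intx(const Machine *m, int lsi_irq)
{
    IntxRoute route = { .mode = INTX_DISABLED, .irq = -1 };

    assert(m != NULL);
    if (m->active != NULL && m->active->intx_resample_ok) {
        route.mode = INTX_ENABLED;
        route.irq = lsi_irq;
    }
    return route;
}
```

With a flag like this, the device model does not need to know whether
the backend is XICS or XIVE, and a machine that has XIVE available but
runs another backend is still handled according to the backend in use.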



+        route.mode = PCI_INTX_ENABLED;
+        route.irq = sphb->lsi_table[pin].irq;
+    }
return route;
