[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH 2/3] VFIO: Clear INTx pending state on EEH reset

From: Gavin Shan
Subject: Re: [Qemu-devel] [PATCH 2/3] VFIO: Clear INTx pending state on EEH reset
Date: Mon, 16 Mar 2015 12:04:59 +1100
User-agent: Mutt/1.5.21 (2010-09-15)

On Fri, Mar 13, 2015 at 03:51:27PM -0600, Alex Williamson wrote:
>On Wed, 2015-03-11 at 17:11 +1100, Gavin Shan wrote:
>> When Linux guest recovers from EEH error on the following Emulex
>> adapter, the MSIx interrupts are disabled and the INTx emulation
>> is enabled. One INTx interrupt is injected to the guest by host
>> because of detected pending INTx interrupts on the adapter. QEMU
>> disables mmap'ed BAR regions and starts a timer to enable those
>> regions at later point the INTx interrupt handler. Unfortunately,
>> "VFIOPCIDevice->intx.pending" isn't cleared, meaning those disabled
>> mapp'ed BAR regions won't be reenabled properly. It leads to EEH
>> recovery failure at guest side because of hanged MMIO access.
>>  # lspci | grep Emulex
>>  0000:01:00.0 Ethernet controller: Emulex Corporation \
>>               OneConnect 10Gb NIC (be3) (rev 02)
>>  0000:01:00.1 Ethernet controller: Emulex Corporation \
>>               OneConnect 10Gb NIC (be3) (rev 02)
>> The patch clears "VFIOPCIDevice->intx.pending" after EEH reset
>> is completed on the PE, which contains the adapter. In turn, the
>> mmap'ed BAR regions can be reenabled to avoid EEH recovery failure.
>> Signed-off-by: Gavin Shan <address@hidden>
>> ---
>>  hw/vfio/pci.c | 14 ++++++++++++++
>>  1 file changed, 14 insertions(+)
>> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
>> index 8c4a8cb..55e0904 100644
>> --- a/hw/vfio/pci.c
>> +++ b/hw/vfio/pci.c
>> @@ -3352,6 +3352,20 @@ int vfio_container_eeh_event(AddressSpace *as, 
>> int32_t groupid,
>>          }
>>          break;
>> +        /*
>> +         * We might have INTx interrupt whose handler disabled the
>> +         * memory mapped BARs. Without clearing the INTx pending
>> +         * state, the timer kicked by the INTx interrupt handler
>> +         * won't enable those disabled memory mapped BARs, which
>> +         * leads EEH recovery failure.
>> +         */
>> +        QLIST_FOREACH(vbasedev, &group->device_list, next) {
>> +            vdev = container_of(vbasedev, VFIOPCIDevice, vbasedev);
>> +            vdev->intx.pending = false;
>> +        }
>I'm nervous that "pending" is trying to track that a) the host interrupt
>is masked and b) the emulated INTx line for the device is asserted, but
>we're not clearing the state of any of that here.  We can handle a
>spurious EOI, the device should simply re-assert the interrupt, but
>changing one piece of tracking w/o getting everything in sync seems like
>a looming bug.  Thanks,

After the PCI device takes EEH reset, we shouldn't see pending INTx from
hardware side. So I think it's safe to clear "intx.pending" when EEH reset
is completed. However, I think I have to explain more about how this issue
was found and my understanding:

(1) Guest detects EEH errors. The device driver disables MSIx interrupt by
calling pci_disable_msix(), which disable MSIx vectors and then enable

(2) QEMU sends IOCTL commands to host to disable MSIx and enable INTx. At
this stage the INTx is still masked. At later point, the guest is requesting
unmasking INTx, which is captured by host. Host checks and founds pending
INTx, which is sent to QEMU. In QEMU INTx handler (vfio_intx_interrupt()),
the mmap'ed regions are disabled, "intx.pending" is set and a timer is started
to reenable mmap'ed regions if "intx.pending" is cleared there. However,
"intx.pending" is only cleared upon BAR access in slow path, which is never

(3) After guest disables MSIx and issue EEH reset, the device driver starts
to check its firmware state by reading MMIO register, which isn't completed
by QEMU VFIO BAR slow path (Note: fast path supported by mmaped regions have
been disabled). Eventually, the guest hangs on reading MMIO register. With
this patch applied to QEMU, I didn't see the problem again. 


>> +
>> +        break;
>>      }
>>      vfio_put_group(group);

reply via email to

[Prev in Thread] Current Thread [Next in Thread]