[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH] Xen PCI passthrough: fix passthrough failure wh
From: |
Jan Beulich |
Subject: |
Re: [Qemu-devel] [PATCH] Xen PCI passthrough: fix passthrough failure when irq map failure |
Date: |
Fri, 23 Nov 2018 03:26:44 -0700 |
>>> On 23.11.18 at 11:19, <address@hidden> wrote:
> Adding Jan in case he has an opinion on my reply below.
I agree, fwiw.
Jan
> On Fri, Nov 23, 2018 at 12:04:51AM -0500, Zhao Yan wrote:
>> On Thu, Nov 22, 2018 at 03:18:05PM +0100, Roger Pau Monné wrote:
>> > On Thu, Nov 22, 2018 at 08:11:20AM -0500, Zhao Yan wrote:
>> > > On Thu, Oct 18, 2018 at 03:56:36PM +0100, Roger Pau Monné wrote:
>> > > > On Thu, Oct 18, 2018 at 08:22:41AM +0000, Zhao, Yan Y wrote:
>> > > > > Hi
>> > > > > The background for this patch is that: for some pci device, even
>> > > > > it's PCI_INTERRUPT_PIN is not 0, it
actually does not support INTx mode, so we should just report error, disable
INTx mode and continue the passthrough.
>> > > > > However, the commit 5a11d0f7 regards this as error condition and let
>> > > > > qemu quit passthrough, which is too
rigorous.
>> > > > >
>> > > > > Error message is below:
>> > > > > libxl: error: libxl_qmp.c:287:qmp_handle_error_response: Domain
>> > > > > 2:received an error message from QMP server:
Mapping machine irq 0 to pirq -1 failed: Operation not permitted
>> > > >
>> > > > I'm having issues figuring out what's happening here.
>> > > > s->real_device.irq is 0, yet the PCI config space read of
>> > > > PCI_INTERRUPT_PIN returns something different than 0.
>> > > >
>> > > > AFAICT this is due to some kind of error in Linux, so that even when
>> > > > the device is supposed to have a valid IRQ the sysfs node it is set to
>> > > > 0, do you know the actual underlying cause of this?
>> > > >
>> > > > Thanks, Roger.
>> > > Hi Roger
>> > > Sorry for the later reply, I just missed this mail...
>> > > On my side, it's because the hardware actually does not support INTx
>> > > mode,
>> > > but its configuration space does not report PCI_INTERRUPT_PIN to 0. It's
>> > > a
>> > > hardware bug, but previous version of qemu can tolerate it, switch to MSI
>> > > and make passthrough work.
>> >
>> > Then I think it would be better to check both PCI_INTERRUPT_PIN and
>> > s->real_device.irq before attempting to map the IRQ.
>> >
>> > Making the error non-fatal would mean that a device with a valid IRQ
>> > could fail to be setup correctly but the guest will still be created,
>> > and things won't go as expected when the guest attempts to use it.
>> >
>> > Thanks, Roger.
>> hi roger
>> thanks for your sugguestion. it's right that "s->real_device.irq" is needed
>> to be checked before mapping, like if
it's 0.
>> but on the other hand, maybe xc_physdev_map_pirq() itself can serve as a
>> checking of "s->real_device.irq" ?
>> like in our case, it will fail and return -EPERM.
>> then error hanling is still conducted ==>set INTX_DISABLE flag, eventhrough
>> the error is not fatal.
>>
>> machine_irq = s->real_device.irq;
>> rc = xc_physdev_map_pirq(xen_xc, xen_domid, machine_irq, &pirq);
>> if (rc < 0) {
>> error_setg_errno(errp, errno, "Mapping machine irq %u to"
>> " pirq %i failed", machine_irq, pirq);
>>
>> /* Disable PCI intx assertion (turn on bit10 of devctl) */
>> cmd |= PCI_COMMAND_INTX_DISABLE;
>> machine_irq = 0;
>> s->machine_irq = 0;
>> So, do you think it's all right just converting fatal error to non-fatal?
>
> As I said above, I think it would be better to leave the error as
> fatal and avoid attempting a xc_physdev_map_pirq with a machine_irq ==
> 0, which will fail.
>
> If we really want to go down the route of making the error non-fatal,
> I think you will also have to report PCI_INTERRUPT_PIN as 0 to the
> guest, so that it's clear to the guest that the device doesn't have
> legacy interrupt support.
>
> Exposing a device with PCI_INTERRUPT_PIN != 0 but then not allowing
> the guest to clear PCI_COMMAND_INTX_DISABLE is likely bogus.
>
> Thanks, Roger.