Re: [Qemu-devel] [RFC PATCH] PCI: Introduce INTx check & mask API

qemu-devel

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [RFC PATCH] PCI: Introduce INTx check & mask API

From:	Jan Kiszka
Subject:	Re: [Qemu-devel] [RFC PATCH] PCI: Introduce INTx check & mask API
Date:	Fri, 25 May 2012 07:43:04 -0300
User-agent:	Mozilla/5.0 (X11; U; Linux i686 (x86_64); de; rv:1.8.1.12) Gecko/20080226 SUSE/2.0.0.12-1.1 Thunderbird/2.0.0.12 Mnenhy/0.7.5.666

On 2012-05-24 23:47, Alexey Kardashevskiy wrote:
> On 25/05/12 12:29, Jan Kiszka wrote:
>> On 2012-05-24 22:18, Alexey Kardashevskiy wrote:
>>> On 24/05/12 22:02, Jan Kiszka wrote:
>>>> On 2012-05-24 04:44, Alexey Kardashevskiy wrote:
>>>>> [Found while debugging VFIO on POWER but it is platform independent]
>>>>>
>>>>> There is a feature in PCI (>=2.3?) to mask/unmask INTx via PCI_COMMAND and
>>>>> PCI_STATUS registers.
>>>>
>>>> Yes, 2.3 introduced this. Masking is done via command register, checking
>>>> if the source was the PCI in question via the status register. The
>>>> latter is important for supporting IRQ sharing - and that's why we
>>>> introduced this masking API to the PCI layer.
>>>
>>>
>>> Is not it just a quite small optimization to not to disable interrupts on 
>>> all devices which share
>>> the same IRQ but just on those who fired an interrupt? If so, do PCI 
>>> devices really often share
>>> IRQs? Does not supporting this mean real slowdown on such devices?
>>>
>>> As far as I understand, everyone who cares about performance uses MSI/MSIX, 
>>> no?
>>
>> Not everyone is blessed with MSI-only PCI devices. From my notebook:
>>
>> # cat /proc/interrupts
>> [...]
>>  22: [...] IO-APIC-fasteoi   ehci_hcd:usb1, ehci_hcd:usb2
>>
>> So, if I want to assign one EHCI controller to a guest, I have to
>> disable the other as well. The same can happen quickly if you attach a
>> few legacy PCI adapters to a system and want to pass them through.
> 
> Why? vfio-pci receives interrupt, disables it, handles it, enables interrupt 
> back. Yes, handling is
> a bit longer and includes passing interrupt to QEMU and then to the guest 
> (can be optimized to avoid
> QEMU) and waiting for EOI notification but this is all the difference.

You can disable the complete IRQ line as you cannot predict when the
untrusted device driver that VFIO, KVM, or UIO serves will finally
decide to silence the IRQ reason in hardware. If you did this, you risk
a DoS attack on those other devices.

> 
> Does the current kernel use INTx bit for your USB controllers now, without 
> any KVM, etc?

No, it is only used for KVM device assignment when it grabs a device and
uio_pci_generic. If a host driver uses the device, and can silence
interrupts in a device-specific way.

> 
> So, is it just an optimization or it is something bigger that I missed?

It is not an optimization but an essential feature to support INTx
sharing between an untrusted device driver and some other driver.

> 
> 
>>>>> And there is some API to support that (commit 
>>>>> a2e27787f893621c5a6b865acf6b7766f8671328).
>>>>>
>>>>> I have a network adapter:
>>>>> 0001:00:01.0 Ethernet controller: Chelsio Communications Inc T310 10GbE 
>>>>> Single Port Adapter
>>>>>   Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ 
>>>>> Stepping- SERR+ FastB2B- DisINTx-
>>>>>   Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- 
>>>>> <MAbort- >SERR- <PERR- INTx-
>>>>>
>>>>> pci_intx_mask_supported() reports that the feature is supported for this 
>>>>> adapter
>>>>> BUT the adapter does not set PCI_STATUS_INTERRUPT so 
>>>>> pci_check_and_set_intx_mask()
>>>>> never changes PCI_COMMAND and INTx does not work on it when we use it as 
>>>>> VFIO-PCI device.
>>>>>
>>>>> If I remove the check of this bit, it works fine as it is called from an 
>>>>> interrupt handler and
>>>>> Status bit check is redundant.
>>>>>
>>>>> Opened a spec:
>>>>> PCI LOCAL BUS SPECIFICATION, REV. 3.0, Table 6-2: Status Register Bits
>>>>> ===
>>>>> 3 This read-only bit reflects the state of the interrupt in the
>>>>> device/function. Only when the Interrupt Disable bit in the command
>>>>> register is a 0 and this Interrupt Status bit is a 1, will the
>>>>> device’s/function’s INTx# signal be asserted. Setting the Interrupt
>>>>>    Disable bit to a 1 has no effect on the state of this bit.
>>>>> ===
>>>>> With this adapter, INTx# is asserted but Status bit is still 0.
>>>>>
>>>>> Is it mandatory for a device to set Status bit if it supports INTx 
>>>>> masking?
>>>>>
>>>>> 2 Alex: if it is mandatory, then we need to be able to disable pci_2_3 in 
>>>>> VFIO-PCI
>>>>> somehow.
>>>>
>>>> Since PCI 2.3, this bit is mandatory, and it should be independent of
>>>> the masking bit. The question is, if your device is supposed to support
>>>> 2.3, thus is just buggy, or if our detection algorithm is unreliable. It
>>>> basically builds on the assumption that, if we can flip the mask bit,
>>>> the feature should be present. I guess that is the best we can do. Maybe
>>>> we can augment this with a blacklist of devices that "support" flipping
>>>> without actually providing the feature.
>>>
>>> It is a good moment to start :)
>>> Not sure where - in VFIO or along with that PCI INTx API.
>>
>> At PCI level as the API is VFIO agnostic (it was introduced for
>> "classic" KVM device assignment, in fact).
>>> Here is that broken device:
>>> address@hidden:~$ lspci -s 1:1:0.0
>>> 0001:01:00.0 Ethernet controller: Chelsio Communications Inc T310 10GbE 
>>> Single Port Adapter
>>> address@hidden:~$ lspci -ns 1:1:0.0
>>> 0001:01:00.0 0200: 1425:0030
>>
>> A patch to add the infrastructure as well would be even more welcome. :)
>> You could have a look at drivers/pci/quirks.c for patterns how to do this.
> 
> I am not sure yet that we need this feature at all ;) I would rather prefer 
> to have some way to
> disable it in VFIO rather than to add yet another quirk for the feature which 
> nobody uses at the moment.
> Really, this device supports MSI/MSIX and in real life nobody is going to use 
> INTx on it. The only
> need for it is testing.

These are wrong assumptions, both that it has to be addressed at VFIO
level and that it has no serious use case. We will need this feature for
quite a while until legacy PCI finally died. Bets are taken when this
will happen, but I would be careful with any date in this decade. ;)

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

[Prev in Thread]

Current Thread

[Next in Thread]

[Qemu-devel] [RFC PATCH] PCI: Introduce INTx check & mask API, Alexey Kardashevskiy, 2012/05/24
- Re: [Qemu-devel] [RFC PATCH] PCI: Introduce INTx check & mask API, Jan Kiszka, 2012/05/24
  - Re: [Qemu-devel] [RFC PATCH] PCI: Introduce INTx check & mask API, Alex Williamson, 2012/05/24
    - Re: [Qemu-devel] [RFC PATCH] PCI: Introduce INTx check & mask API, Alexey Kardashevskiy, 2012/05/24
  - Re: [Qemu-devel] [RFC PATCH] PCI: Introduce INTx check & mask API, Alexey Kardashevskiy, 2012/05/24
    - Re: [Qemu-devel] [RFC PATCH] PCI: Introduce INTx check & mask API, Jan Kiszka, 2012/05/24
    - Re: [Qemu-devel] [RFC PATCH] PCI: Introduce INTx check & mask API, Alexey Kardashevskiy, 2012/05/24
    - Re: [Qemu-devel] [RFC PATCH] PCI: Introduce INTx check & mask API, Jan Kiszka <=
    - Re: [Qemu-devel] [RFC PATCH] PCI: Introduce INTx check & mask API, Alexey Kardashevskiy, 2012/05/25
    - Re: [Qemu-devel] [RFC PATCH] PCI: Introduce INTx check & mask API, Jan Kiszka, 2012/05/25

Prev by Date: [Qemu-devel] [PATCH] ISCSI: redo how we set up the events to only call qemu_aio_set_fd_handler() and qemu_notify_event() if something has changed.
Next by Date: [Qemu-devel] [PATCH v4 16/16] hub: add the support for hub own flow control
Previous by thread: Re: [Qemu-devel] [RFC PATCH] PCI: Introduce INTx check & mask API
Next by thread: Re: [Qemu-devel] [RFC PATCH] PCI: Introduce INTx check & mask API
Index(es):
- Date
- Thread