[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [RFC PATCH] s390x/pci: vfio-pci breakage with disabled mem enforceme
Re: [RFC PATCH] s390x/pci: vfio-pci breakage with disabled mem enforcement
Tue, 28 Jul 2020 16:13:57 +0200
Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.9.0
On 7/28/20 2:52 PM, Alex Williamson wrote:
> On Tue, 28 Jul 2020 11:33:55 +0200
> Niklas Schnelle <firstname.lastname@example.org> wrote:
>> On 7/27/20 6:47 PM, Alex Williamson wrote:
>>> On Mon, 27 Jul 2020 17:40:39 +0200
>>> Pierre Morel <email@example.com> wrote:
>>>> On 2020-07-23 18:29, Alex Williamson wrote:
>>>>> On Thu, 23 Jul 2020 11:13:55 -0400
>>>>> Matthew Rosato <firstname.lastname@example.org> wrote:
>>>>>> I noticed that after kernel commit abafbc55 'vfio-pci: Invalidate mmaps
>>>>>> and block MMIO access on disabled memory' vfio-pci via qemu on s390x
>>>>>> fails spectacularly, with errors in qemu like:
>> ... snip ...
>>>> Alex, Matt,
>>>> in s390 we have the possibility to assign a virtual function to a
>>>> logical partition as function 0.
>>>> In this case it can not be treated as a virtual function but must be
>>>> treated as a physical function.
>>>> This is currently working very well.
>>>> However, these functions do not set PCI_COMMAND_MEMORY as we need.
>>> Where is the vendor and device ID virtualization done for these
>>> devices, we can't have a PF with IDs 0000:0000.
>> Pierre doesn't mean the Device/Vendor IDs he means it has devfn == 0
>> so it is the mandatory function zero on it's PCI bus, where until recently
>> we always had only one function per bus but with the recent multi-function
>> support it can act more like on other platforms with several PCI functions
>> sharing the same Bus e.g. a PF and the VFs created through sriov_numvfs.
>> That's why I'm saying that having devfn == 0 should not be very special for
>> a VF
>> passed to a VM and I really don't see where it would not act like a VF passed
>> from any other Hypervisor.
> My question is relative to other registers on VFs that are not
> implemented in hardware, not the device address. In addition to the
> memory bit of the command register, SR-IOV VFs do not implement the
> vendor and device IDs, these are to be virtualized from the values
> provided in the PF SR-IOV capability. It wouldn't make much sense to
> expose a device with no PCI vendor or device ID, so I assume the z/VM
> hypervisor is virtualizing these, but not the memory bit.
Ahh, yes I see. On Z these are actually already virtualized at the LPAR
level as part of the firmware based scanning I mentioned that is the
reason for pdev->no_vf_scan. This is true even for VFs created
through sriov_numvfs in the host which is why I did not realize these
don't come from hardware, even though looking at drivers/pci/iov.c it's
pretty obvious and I did touch that code before. Sorry for the confusion.
>> The only really tricky part in my opinion is where during the "probing"
>> we do set is_virtfn so it happens both for VFs passed-through from z/VM
>> or LPAR and VFs created through sriov_numvfs which unlike on other platforms
>> are also scanned by Firmware (pdev->no_vf_scan disables the Linux side
>> With the fix I'm currently testing I had to do this in
>> because I also create sysfs links between VFs and their parent PFs and those
>> need the sysfs entries to be already created, which makes the more
>> sound pcibios_bus_add_device() too early.
> I don't think it would be wise to set is_virtfn without a physfn being
> present in the OS view. I believe pci_dev.is_virtfn implies
> pci_dev.physfn points to the PF device. Thanks>
Thank you a lot for that hint, you're absolutely right, while the
drivers do work with is_virtfn == 1 && physffn == NULL
vfio-pci gets very confused. And sorry Pierre for doubting
your is_virtfn_detached idea, this thing really is confusing.
>>>> Shouldn't we fix this inside the kernel, to keep older QMEU working?
>>>> Then would it be OK to add a new bit/boolean inside the
>>>> pci_dev/vfio_pci_device like, is_detached_vfn, that we could set during
>>>> enumeration and test inside __vfio_pci_memory_enabled() to return true?
>>> Probably each instance of is_virtfn in vfio-pci should be looked at to
>>> see if it applies to s390. If we're going to recognize this as a VF,
>>> I'd rather we complete the emulation that the lower level hypervisor
>>> has missed. If we can enable all the is_virtfn code on s390, then we
>>> should probably cache is_virtfn on the vfio_pci_device object and allow
>>> s390 a place to set it once at probe or enable time.
>>>> In the enumeration we have the possibility to know if the function is a
>>>> HW/Firmware virtual function on devfn 0 or if it is created by SRIOV.
>>>> It seems an easy fix without side effects.
>>>> What do you think?
>>> It sure seems preferable to recognize that it is a VF in the kernel
>>> than to require userspace to have arch specific hacks. Thanks,
Re: [RFC PATCH] s390x/pci: vfio-pci breakage with disabled mem enforcement, Niklas Schnelle, 2020/07/24