[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] Questions about VFIO enabling MSIX vector

From: Heyi Guo
Subject: Re: [Qemu-devel] Questions about VFIO enabling MSIX vector
Date: Tue, 15 Jan 2019 11:21:51 +0800
User-agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:45.0) Gecko/20100101 Thunderbird/45.7.1

Hi Alex,

Really appreciate your comments. I have some more questions below.

On 2019/1/15 0:07, Alex Williamson wrote:
On Sat, 12 Jan 2019 10:30:40 +0800
Heyi Guo <address@hidden> wrote:

Hi folks,

I have some questions about vfio_msix_vector_do_use() in
hw/vfio/pci.c, could you help to explain?

We can see that when guest tries to enable one specific MSIX vector
by unmasking MSIX Vector Control, the access will be trapped and then
into function vfio_msix_vector_do_use(). And we may go to the below
branch in line 525:

520 /*
521  * We don't want to have the host allocate all possible MSI vectors
522  * for a device if they're not in use, so we shutdown and incrementally
523  * increase them as needed.
524  */
525  if (vdev->nr_vectors < nr + 1) {
526      vfio_disable_irqindex(&vdev->vbasedev, VFIO_PCI_MSIX_IRQ_INDEX);
527      vdev->nr_vectors = nr + 1;
528      ret = vfio_enable_vectors(vdev, true);
529      if (ret) {
530          error_report("vfio: failed to enable vectors, %d", ret);
531      }

Here all MSIX vectors will be disabled first and then enabled, with
one more MSIX. The comment is there but I still don't quite
understand. It makes sense for not allocating all possible MSI
vectors, but why shall we shutdown the whole MSI when being requested
to enable one specific vector? Can't we just enable the user
specified vector indexed by "nr"?
What internal kernel API would vfio-pci make use of to achieve that?
We can't know the intentions of the guest and we have a limited set of
tools to manage the MSI-X state in the host kernel.  It's generally the
case that MSI-X vectors are only manipulated during driver setup and
shutdown, so while the current behavior isn't necessarily efficient, it
works within the constraints and generally doesn't have a runtime impact
on performance.  We also recognize that this behavior may change in the
future depending on more dynamic internal host kernel interfaces, thus
we export the VFIO_IRQ_INFO_NORESIZE flag to indicate to userspace
whether this procedure is required.

I tried to enable the specified MSIX vector only, and finally understood the 
original QEMU code after always getting EINVAL from ioctl->VFIO_DEVICE_SET_IRQS.
Then I tried to allocate all MSIX vectors in vfio_msix_enable(), not only MSIX vector 0. 
The change works, but it definitely doesn't follow the comment in the code "We don't 
want to have the host allocate all possible MSI vectors for a device if they're not in 
use". May I ask what the side effect is if we really allocate all possible vectors 
at the beginning?

What's more, on ARM64 systems with GIC ITS, the kernel will issue an
ITS discard command when disabling a MSI vector, which will drop
currently pending MSI interrupt. If device driver in guest system
enables some MSIs first and interrupts may come at any time, and then
it tries to enable other MSIs, is it possible for the above code to
cause interrupts missing?
Interrupt reconfiguration is generally during driver setup or teardown
when the device is considered idle or lost interrupts are not a
concern.  If you have a device which manipulates interrupt
configuration runtime, you can expect that lost interrupts won't be the
only problem,
On physical machine, if the driver enables vector 0 first, and then vector 1 
later, will the vector 0 interrupt be lost for some possibility? In my opinion 
the manipulation of vector 1 should not affect the interrupt of vector 0, isn't 
it? But if this is possible in virtual world, what do we consider it as? A bug 
to be fixed in future, or a known limitation of virtualization that won't be 
fixed and all driver developers should pay attention to it?



it's also likely to experience much more overhead in the
interrupt manipulation path under assignment than on bare metal.  This
is another reason that well behaved drivers generally use relatively
static interrupt configurations initialized at driver setup time.



reply via email to

[Prev in Thread] Current Thread [Next in Thread]