qemu-devel

RE: [PATCH v2 0/5] virtio mmio specification enhancement


From: Pincus, Josh
Subject: RE: [PATCH v2 0/5] virtio mmio specification enhancement
Date: Mon, 3 Aug 2020 23:31:17 +0000

Hi Alex,

Thank you for the reply.

Please see my inline response below.

-----Original Message-----
From: Alex Bennée <alex.bennee@linaro.org> 
Sent: Friday, July 31, 2020 8:45 AM
To: Pincus, Josh <Josh.Pincus@windriver.com>
Cc: linux-kernel@vger.kernel.org; zhabin@linux.alibaba.com; 
virtio-dev@lists.oasis-open.org; qemu-devel@nongnu.org
Subject: Re: [PATCH v2 0/5] virtio mmio specification enhancement


Pincus, Josh <Josh.Pincus@windriver.com> writes:

> Hi,
>
> We were looking into a similar enhancement for the Virt I/O MMIO transport 
> and came across this project.
>
> This enhancement would be perfect for us.

So there is certainly an interest in optimising MMIO based virtio and the 
current read/ack cycle adds additional round trip time for any trap and emulate 
hypervisor. However I think there is some resistance to making MMIO a 
re-implementation of what PCI already gives us for "free".

I believe the current questions that need to be addressed are:

  - Clear definitions in the spec on doorbells/notifications

    The current virtio spec uses different terms in some places so it
    would be nice to clarify the language and formalise what the
    standard expects from transports w.r.t the capabilities of
    notifications and doorbells.

[JP] The read/ack cycle not only adds to the round-trip time for any trap and 
emulate HYP, but it also precludes an environment where one might want to avoid 
emulation completely.  We're interested in using the MMIO transport combined 
with an augmented device node in the DTB to have device features, reserved 
memory for queues, and specific MSI interrupts per queue conveyed to the guest 
statically.  In this kind of restricted environment, negotiation for features 
might be completely disabled; you see what the device node describes and you 
either support those features or not.  Likewise, the standard list of state 
machine transitions for communicating driver and device state would be skipped.
A driver in a guest comes up, reads the device node info, uses the queues as 
described, and assigns the MSI vectors per queue and config-has-changed 
service.  When an interrupt comes in, there's no need to ack it beyond the 
normal way in which one conveys an EOI to hardware.  It also means that with 
one dedicated interrupt per queue we won't have to select the queue in question 
and test which one got updated.  In short, we are experimenting with getting 
rid of the emulation if we can.
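
As a rough illustration of the difference described above (a sketch, not code from this thread or from any driver), the following C fragment contrasts the shared-interrupt read/ack path with a per-queue vector path. The register offsets and status bits are the ones the virtio-mmio transport defines for InterruptStatus and InterruptACK; `struct fake_dev`, `mmio_read`, `mmio_write`, `legacy_irq`, `per_queue_irq`, and the two-queue device are hypothetical stand-ins that only count how many register accesses would have to be trapped and emulated.

```c
#include <assert.h>
#include <stdint.h>

/* Register offsets and status bits defined by the virtio-mmio transport. */
#define VIRTIO_MMIO_INTERRUPT_STATUS 0x060
#define VIRTIO_MMIO_INTERRUPT_ACK    0x064
#define VIRTIO_MMIO_INT_VRING        (1u << 0)
#define VIRTIO_MMIO_INT_CONFIG       (1u << 1)

#define NQUEUES 2 /* assumption: a hypothetical device with two queues */

/* Hypothetical stand-in for a register window that a hypervisor traps. */
struct fake_dev {
    uint32_t interrupt_status;
    unsigned trapped_accesses;     /* emulated MMIO accesses so far */
    unsigned queue_polls[NQUEUES]; /* times each queue was inspected */
    unsigned config_changes;
};

static uint32_t mmio_read(struct fake_dev *d, uint32_t off)
{
    assert(off == VIRTIO_MMIO_INTERRUPT_STATUS);
    d->trapped_accesses++;
    return d->interrupt_status;
}

static void mmio_write(struct fake_dev *d, uint32_t off, uint32_t val)
{
    assert(off == VIRTIO_MMIO_INTERRUPT_ACK);
    d->trapped_accesses++;
    d->interrupt_status &= ~val; /* writing a bit acknowledges it */
}

/* Shared interrupt: read status, ack it, then inspect every queue. */
static void legacy_irq(struct fake_dev *d)
{
    uint32_t status = mmio_read(d, VIRTIO_MMIO_INTERRUPT_STATUS);

    mmio_write(d, VIRTIO_MMIO_INTERRUPT_ACK, status);
    if (status & VIRTIO_MMIO_INT_CONFIG)
        d->config_changes++;
    if (status & VIRTIO_MMIO_INT_VRING)
        for (unsigned q = 0; q < NQUEUES; q++)
            d->queue_polls[q]++;
}

/* Dedicated vector per queue: the vector itself names the queue, so
 * no status read, no ack write, and no scan of the other queues. */
static void per_queue_irq(struct fake_dev *d, unsigned q)
{
    assert(q < NQUEUES);
    d->queue_polls[q]++;
}
```

In this model one shared interrupt costs two trapped accesses and a scan of both queues, while a per-queue interrupt costs no trapped accesses and touches only the queue that fired.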

  - Quantifying the memory foot-print difference between PCI/MMIO

    PCI gives a lot for free including a discovery and IRQ model already
    designed to handle MSI/MSI-X. There is a claim that this brings in a
    lot of bloat but I think there was some debate around the numbers.
    My rough initial experiment with a PCI and non-PCI build with
    otherwise identical VIRTIO configs results in the following:

    16:40:15 c.282% [alex@zen:~/l/l/builds] review/rpmb|… + ls -l arm64/vmlinux arm64.nopci/vmlinux
    -rwxr-xr-x 1 alex alex 83914728 Jul 31 16:39 arm64.nopci/vmlinux*
    -rwxr-xr-x 1 alex alex 86368080 Jul 31 16:33 arm64/vmlinux*

    which certainly implies there could be a fair amount of headroom for
    an MMIO version to implement some features. However I don't know if
    it's fully apples to apples as there may be unneeded PCI bloat that a
    virtio-only kernel doesn't need.
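
For reference, the two file sizes in the listing above differ by 2453352 bytes, roughly 2.3 MiB, or about 2.9 percent of the non-PCI image. The short C sketch below just reproduces that arithmetic; the constant and function names are made up for illustration, and the on-disk size of a vmlinux ELF is only a proxy for the runtime memory footprint.

```c
#include <stdint.h>

/* Sizes in bytes, copied from the ls -l output quoted above. */
static const int64_t PCI_BYTES   = 86368080; /* arm64/vmlinux */
static const int64_t NOPCI_BYTES = 83914728; /* arm64.nopci/vmlinux */

/* Extra bytes in the PCI-enabled image. */
static int64_t size_delta(int64_t with_pci, int64_t without_pci)
{
    return with_pci - without_pci;
}

/* The same difference as a percentage of the non-PCI image. */
static double delta_percent(int64_t with_pci, int64_t without_pci)
{
    return 100.0 * (double)size_delta(with_pci, without_pci)
                 / (double)without_pci;
}

/* The same difference expressed in MiB (1 MiB = 1048576 bytes). */
static double delta_mib(int64_t with_pci, int64_t without_pci)
{
    return (double)size_delta(with_pci, without_pci) / 1048576.0;
}
```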

[JP] Apropos of your subsequent email on this topic, the PCI bloat isn't 
terrible.  The major stumbling block in our case is that we would like to see 
if there's a restricted model in which the emulation can be removed completely.
Case in point: Virt I/O RPMsgs in OpenAMP only use the queues to transfer data 
back and forth.  (Unless I'm mistaken?)  We'd like to see if that model can be 
a bit more generalized so that other kinds of drivers can be constructed that 
similarly don't rely on emulation for handling interrupt read/ack, feature 
negotiation, queue selection, etc.  Memory is mapped into the guest for queues 
and R/O device registers, interrupts are assigned in the DTB for each queue, 
and features are, essentially, non-negotiable.  
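
One way to picture "non-negotiable" features (a sketch under assumptions, not a proposed binding or an existing driver API): the driver reads a fixed feature mask from the device node and simply checks that every advertised bit is one it implements. The feature bit numbers below are the ones the virtio specification assigns; `features_acceptable` and the idea of a feature mask stored in the device node are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Feature bit numbers as assigned by the virtio specification. */
#define VIRTIO_RING_F_INDIRECT_DESC 28
#define VIRTIO_RING_F_EVENT_IDX     29
#define VIRTIO_F_VERSION_1          32

#define FEATURE_BIT(n) (UINT64_C(1) << (n))

/* Hypothetical check: with no negotiation step, the driver can only
 * accept the device as described or refuse to bind. It accepts when
 * every bit in the statically advertised mask is one it supports. */
static bool features_acceptable(uint64_t advertised, uint64_t supported)
{
    return (advertised & ~supported) == 0;
}
```

A driver that supports VIRTIO_F_VERSION_1 and VIRTIO_RING_F_EVENT_IDX would accept a node advertising only VIRTIO_F_VERSION_1, and would decline one that also advertises VIRTIO_RING_F_INDIRECT_DESC.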

What are the features you are most interested in?

[JP] See above. 😉 The restricted environment in question is for very simple 
applications that don't have any kind of PCI infrastructure and for virtual 
environments with no HYP or a very restricted HYP.  

> Has there been any progress since Feb, 2020?  It looks like the effort 
> might have stalled?

I can't speak for the OPs, but there is certainly interest from others who are 
not the original posters.

[JP] Maybe we can restart the thread/discussion and see where it goes from here.

--
Alex Bennée
