Re: [Qemu-devel] [RFC 0/2] vhost-vfio: introduce mdev based HW vhost backend


From: Liang, Cunming
Subject: Re: [Qemu-devel] [RFC 0/2] vhost-vfio: introduce mdev based HW vhost backend
Date: Wed, 7 Nov 2018 12:26:47 +0000


> -----Original Message-----
> From: Jason Wang [mailto:address@hidden
> Sent: Tuesday, November 6, 2018 4:18 AM
> To: Wang, Xiao W <address@hidden>; address@hidden;
> address@hidden
> Cc: address@hidden; Bie, Tiwei <address@hidden>; Liang, Cunming
> <address@hidden>; Ye, Xiaolong <address@hidden>; Wang, Zhihong
> <address@hidden>; Daly, Dan <address@hidden>
> Subject: Re: [RFC 0/2] vhost-vfio: introduce mdev based HW vhost backend
> 
> 
> On 2018/10/16 9:23 PM, Xiao Wang wrote:
> > What's this
> > ===========
> > Following the patch (vhost: introduce mdev based hardware vhost
> > backend) https://lwn.net/Articles/750770/, which defines a generic
> > mdev device for vhost data path acceleration (aliased as vDPA mdev
> > below), this patch set introduces a new net client type: vhost-vfio.
> 
> 
> Thanks a lot for such an interesting series. Some generic questions:
> 
> 
> If we consider using a software backend as well in the future (e.g. vhost-kernel,
> a relay of virtio-vhost-user, or other cases), maybe vhost-mdev is a better name,
> since it does not tie this to VFIO.
[LC] The initial thought behind the '-vfio' term was that the VFIO UAPI is
used as the interface, VFIO being the only available mdev bus driver. That
leads to the term 'vhost-vfio' in QEMU, while 'vhost-mdev' refers to the
kernel helper that handles vhost messages via mdev.

> 
> 
> >
> > Currently we have 2 types of vhost backends in QEMU: vhost-kernel
> > (tap) and vhost-user (e.g. DPDK vhost). In order to have a kernel-space
> > HW vhost acceleration framework, the vDPA mdev device works as a
> > generic configuring channel.
> 
> 
> Does "generic" configuring channel means dpdk will also go for this way?
> E.g it will have a vhost mdev pmd?
[LC] We don't plan to have a vhost-mdev PMD; the thinking is to have the
existing virtio PMD running consistently on top of vhost-mdev. The virtio PMD
supports the pci bus and the vdev bus (via virtio-user) today; vhost-mdev
would most likely be introduced as another bus (mdev bus) provider. DPDK
support for the mdev bus is in the backlog.

> 
> 
> >   It exposes to user space a non-vendor-specific configuration
> > interface for setting up a vhost HW accelerator,
> 
> 
> Or even a software translation layer on top of existing hardware.
> 
> 
> > based on this, this patch
> > set introduces a third vhost backend called vhost-vfio.
> >
> > How does it work
> > ================
> > The vDPA mdev defines 2 BAR regions, BAR0 and BAR1. BAR0 is the main
> > device interface; vhost messages can be written to or read from this
> > region following the format below. All the regular vhost messages about
> > vring addr, negotiated features, etc., are written to this region directly.
> 
> 
> If I understand this correctly, the mdev is not passed through to the guest
> directly. So what's the reason for inventing a PCI-like device here? I'm
> asking since:
[LC] mdev uses the mandatory 'device_api' attribute to identify the layout.
We picked one of the available ones among pci, platform, amba and ccw. It
would also work to define a new one for this transport.
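For reference, a minimal sketch (modelled on the in-kernel mdev sample drivers, not
taken from this patch set) of how a vDPA parent driver could expose the mandatory
'device_api' type attribute with the PCI layout:

    /* Kernel side: report the mdev type's device_api as a PCI-like layout.
     * The attribute shows up under mdev_supported_types/<type>/device_api. */
    #include <linux/device.h>
    #include <linux/kernel.h>
    #include <linux/mdev.h>
    #include <linux/vfio.h>

    static ssize_t device_api_show(struct kobject *kobj, struct device *dev,
                                   char *buf)
    {
            return sprintf(buf, "%s\n", VFIO_DEVICE_API_PCI_STRING); /* "vfio-pci" */
    }
    MDEV_TYPE_ATTR_RO(device_api);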

> 
> - The vhost protocol is transport independent; we should consider supporting
> transports other than PCI. I know we can even do it with the existing design,
> but it looks rather odd to implement e.g. a ccw device with a PCI-like
> mediated device.
> 
> - Can we try to reuse the vhost-kernel ioctls? Fewer APIs means fewer bugs and
> more code reuse. E.g. virtio-user could benefit from the vhost-kernel ioctl API
> with almost no changes, I believe.
[LC] Agreed, so it reuses the request codes defined by the vhost-kernel ioctls.
But VFIO provides the device-specific pieces (e.g. DMA remapping, interrupts,
etc.), which are the extra APIs introduced by this transport.

> 
> 
> >
> > struct vhost_vfio_op {
> >     __u64 request;
> >     __u32 flags;
> >     /* Flag values: */
> > #define VHOST_VFIO_NEED_REPLY 0x1 /* Whether need reply */
> >     __u32 size;
> >     union {
> >             __u64 u64;
> >             struct vhost_vring_state state;
> >             struct vhost_vring_addr addr;
> >             struct vhost_memory memory;
> >     } payload;
> > };
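As an illustration of this message format (a sketch, not code from the patches),
sending VHOST_SET_FEATURES through BAR0 could look roughly like the following,
assuming the struct vhost_vfio_op definition above and a 'bar0_offset' obtained
from VFIO_DEVICE_GET_REGION_INFO on the VFIO device fd:

    #include <stdint.h>
    #include <string.h>
    #include <unistd.h>
    #include <linux/vhost.h>      /* VHOST_SET_FEATURES request code */
    /* struct vhost_vfio_op as defined in the cover letter above */

    static int vhost_vfio_set_features(int device_fd, off_t bar0_offset,
                                       uint64_t features)
    {
        struct vhost_vfio_op op;

        memset(&op, 0, sizeof(op));
        op.request = VHOST_SET_FEATURES;  /* reuse the vhost-kernel request code */
        op.flags = 0;                     /* no reply needed for this message */
        op.size = sizeof(op.payload.u64);
        op.payload.u64 = features;

        /* The message is written to the BAR0 region; the parent driver decodes it. */
        if (pwrite(device_fd, &op, sizeof(op), bar0_offset) != (ssize_t)sizeof(op))
            return -1;
        return 0;
    }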
> >
> > BAR1 is defined to be a region of doorbells; QEMU can use this region
> > as the host notifier for virtio. To optimize virtio notification, vhost-vfio
> > tries to mmap the corresponding page of BAR1 for each queue and
> > leverages EPT to let the guest virtio driver kick the vDPA device doorbell
> > directly. For the virtio 0.95 case, in which we cannot set a host notifier
> > memory region, QEMU will help to relay the notify to the vDPA device.
> >
> > Note: EPT mapping requires that each queue's notify address is located at the
> > beginning of a separate page; the parameter "page-per-vq=on" can help with that.
> 
> 
> I think QEMU should prepare a fallback for this if page-per-vq is off.
[LC] Yeah, QEMU does that and falls back to a syscall to vhost-mdev in the kernel.
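To illustrate the fast path described above (a sketch under assumptions, not the
actual patch code): mapping one doorbell page per queue from BAR1, where
'bar1_offset' is assumed to come from VFIO_DEVICE_GET_REGION_INFO and the
doorbells are assumed to be laid out one per page (hence page-per-vq=on):

    #include <sys/mman.h>
    #include <unistd.h>

    static void *vhost_vfio_map_doorbell(int device_fd, off_t bar1_offset,
                                         unsigned int queue_idx)
    {
        long page_size = sysconf(_SC_PAGESIZE);
        void *addr;

        /* One doorbell page per queue; QEMU can back the guest's notify
         * address with this mapping so kicks go straight to the device. */
        addr = mmap(NULL, page_size, PROT_READ | PROT_WRITE, MAP_SHARED,
                    device_fd, bar1_offset + (off_t)queue_idx * page_size);
        if (addr == MAP_FAILED)
            return NULL;  /* fall back to relaying the kick through BAR0 */
        return addr;
    }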

> 
> 
> >
> > For interrupt setup, the vDPA mdev device leverages the existing VFIO API to
> > enable interrupt configuration from user space. In this way, KVM's irqfd for
> > virtio can be set on the mdev device by QEMU using ioctl().
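A rough sketch of that ioctl() path (assumptions: a PCI-like MSI-X index and one
vector per virtqueue; 'irqfd' is the eventfd already wired into KVM as the guest's
interrupt source):

    #include <string.h>
    #include <sys/ioctl.h>
    #include <linux/vfio.h>

    static int vhost_vfio_set_queue_irqfd(int device_fd, unsigned int queue_idx,
                                          int irqfd)
    {
        char buf[sizeof(struct vfio_irq_set) + sizeof(int)];
        struct vfio_irq_set *irq_set = (struct vfio_irq_set *)buf;

        memset(buf, 0, sizeof(buf));
        irq_set->argsz = sizeof(buf);
        irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD | VFIO_IRQ_SET_ACTION_TRIGGER;
        irq_set->index = VFIO_PCI_MSIX_IRQ_INDEX;  /* assumed MSI-X layout */
        irq_set->start = queue_idx;                /* one vector per virtqueue */
        irq_set->count = 1;
        memcpy(irq_set->data, &irqfd, sizeof(int));

        /* The parent driver routes the device interrupt into this eventfd,
         * which KVM injects into the guest directly. */
        return ioctl(device_fd, VFIO_DEVICE_SET_IRQS, irq_set);
    }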
> >
> > The vhost-vfio net client will set up a vDPA mdev device specified by a
> > "sysfsdev" parameter. During net client init, the device will be opened and
> > parsed using the VFIO API; the VFIO device fd and the device BAR region
> > offsets will be kept in a VhostVFIO structure. This initialization provides a
> > channel to configure vhost information to the vDPA device driver.
> >
> > To do later
> > ===========
> > 1. The net client initialization uses the raw VFIO API to open the vDPA mdev
> > device; it would be better to provide a set of helpers in hw/vfio/common.c to
> > help vhost-vfio initialize the device easily.
> >
> > 2. For device DMA mapping, QEMU passes memory region info to the mdev
> > device and lets the kernel parent device driver program the IOMMU. This is a
> > temporary implementation; in the future, when the IOMMU driver supports the
> > mdev bus, we can use the VFIO API to program the IOMMU directly for the
> > parent device. Refer to the patch (vfio/mdev: IOMMU aware mediated device):
> > https://lkml.org/lkml/2018/10/12/225
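For the future direction mentioned in item 2, the mapping could then go through
the regular VFIO type1 calls; a minimal sketch (assuming a 'container_fd' with
the type1 IOMMU backend enabled), not part of the posted patches:

    #include <stdint.h>
    #include <string.h>
    #include <sys/ioctl.h>
    #include <linux/vfio.h>

    static int vhost_vfio_dma_map(int container_fd, uint64_t iova,
                                  uint64_t size, void *vaddr)
    {
        struct vfio_iommu_type1_dma_map map;

        memset(&map, 0, sizeof(map));
        map.argsz = sizeof(map);
        map.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE;
        map.vaddr = (uint64_t)(uintptr_t)vaddr;  /* QEMU virtual address of the RAM block */
        map.iova  = iova;                        /* guest physical address */
        map.size  = size;

        /* Programs the IOMMU for the parent device through the container. */
        return ioctl(container_fd, VFIO_IOMMU_MAP_DMA, &map);
    }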
> 
> 
> As Steve mentioned at the KVM Forum, it's better to have at least one sample
> driver, e.g. virtio-net itself.
> 
> Then it would be more convenient for the reviewer to evaluate the whole stack.
> 
> Thanks
> 
> 
> >
> > Vhost-vfio usage
> > ================
> > # Query the number of available mdev instances
> > $ cat /sys/class/mdev_bus/0000:84:00.3/mdev_supported_types/ifcvf_vdpa-vdpa_virtio/available_instances
> >
> > # Create an mdev instance
> > $ echo $UUID > /sys/class/mdev_bus/0000:84:00.3/mdev_supported_types/ifcvf_vdpa-vdpa_virtio/create
> >
> > # Launch QEMU with a virtio-net device
> >      qemu-system-x86_64 -cpu host -enable-kvm \
> >      <snip>
> >      -mem-prealloc \
> >      -netdev type=vhost-vfio,sysfsdev=/sys/bus/mdev/devices/$UUID,id=mynet\
> >      -device virtio-net-pci,netdev=mynet,page-per-vq=on \
> >
> > -------- END --------
> >
> > Xiao Wang (2):
> >    vhost-vfio: introduce vhost-vfio net client
> >    vhost-vfio: implement vhost-vfio backend
> >
> >   hw/net/vhost_net.c                |  56 ++++-
> >   hw/vfio/common.c                  |   3 +-
> >   hw/virtio/Makefile.objs           |   2 +-
> >   hw/virtio/vhost-backend.c         |   3 +
> >   hw/virtio/vhost-vfio.c            | 501 ++++++++++++++++++++++++++++++++++++++
> >   hw/virtio/vhost.c                 |  15 ++
> >   include/hw/virtio/vhost-backend.h |   7 +-
> >   include/hw/virtio/vhost-vfio.h    |  35 +++
> >   include/hw/virtio/vhost.h         |   2 +
> >   include/net/vhost-vfio.h          |  17 ++
> >   linux-headers/linux/vhost.h       |   9 +
> >   net/Makefile.objs                 |   1 +
> >   net/clients.h                     |   3 +
> >   net/net.c                         |   1 +
> >   net/vhost-vfio.c                  | 327 +++++++++++++++++++++++++
> >   qapi/net.json                     |  22 +-
> >   16 files changed, 996 insertions(+), 8 deletions(-)
> >   create mode 100644 hw/virtio/vhost-vfio.c
> >   create mode 100644 include/hw/virtio/vhost-vfio.h
> >   create mode 100644 include/net/vhost-vfio.h
> >   create mode 100644 net/vhost-vfio.c
> >
