Re: RFC: use VFIO over a UNIX domain socket to implement device offloading


From: Stefan Hajnoczi
Subject: Re: RFC: use VFIO over a UNIX domain socket to implement device offloading
Date: Thu, 2 Apr 2020 11:19:42 +0100

On Wed, Apr 01, 2020 at 06:58:20PM +0200, Marc-André Lureau wrote:
> On Wed, Apr 1, 2020 at 5:51 PM Thanos Makatos
> <address@hidden> wrote:
> > > On Thu, Mar 26, 2020 at 09:47:38AM +0000, Thanos Makatos wrote:
> > > > Build MUSER with vfio-over-socket:
> > > >
> > > >         git clone --single-branch --branch vfio-over-socket address@hidden:tmakatos/muser.git
> > > >         cd muser/
> > > >         git submodule update --init
> > > >         make
> > > >
> > > > Run device emulation, e.g.
> > > >
> > > >         ./build/dbg/samples/gpio-pci-idio-16 -s <N>
> > > >
> > > > Where <N> is an available IOMMU group, essentially the device ID, which
> > > > must not previously exist in /dev/vfio/.
> > > >
> > > > Run QEMU using the vfio wrapper library and specifying the MUSER device:
> > > >
> > > >         LD_PRELOAD=muser/build/dbg/libvfio/libvfio.so qemu-system-x86_64 \
> > > >                 ... \
> > > >                 -device vfio-pci,sysfsdev=/dev/vfio/<N> \
> > > >                 -object memory-backend-file,id=ram-node0,prealloc=yes,mem-path=mem,share=yes,size=1073741824 \
> > > >                 -numa node,nodeid=0,cpus=0,memdev=ram-node0
> > > >
> 
> FYI, with QEMU 5.0 you no longer need -numa:
> 
> -object memory-backend-memfd,id=mem,size=2G -M memory-backend=mem
> 
> (hopefully, we will get something even simpler one day)
> 
> > > > Bear in mind that since this is just a PoC lots of things can break,
> > > > e.g. some system call not intercepted etc.
> > >
> > > Cool, I had a quick look at libvfio and how the transport integrates
> > > into libmuser.  The integration on the libmuser side is nice and small.
> > >
> > > It seems likely that there will be several different implementations of
> > > the vfio-over-socket device side (server):
> > > 1. libmuser
> > > 2. A Rust equivalent to libmuser
> > > 3. Maybe a native QEMU implementation for multi-process QEMU (I think JJ
> > >    has been investigating this?)
> > >
> > > In order to interoperate we'll need to maintain a protocol
> > > specification.  Maybe you and JJ could put that together and CC the vfio,
> > > rust-vmm, and QEMU communities for discussion?
> >
> > Sure, I can start by drafting a design doc and sharing it.
> 
> OK! I am quite amazed you went this far with an LD_PRELOAD hack. This
> demonstrates some limits of GPL projects, if it was necessary.
> 
> I think with this work, and the muser experience, you have a pretty
> good idea of what the protocol could look like. My approach, as I
> remember, was a quite straightforward VFIO over socket translation,
> while trying to see if it could share some aspects with vhost-user,
> such as memory handling etc.
> 
> To contrast with the work done on qemu-mp series, I'd also prefer we
> focus our work on a vfio-like protocol, before trying to see how qemu
> code and interface could be changed over multiple binaries etc. We
> will start with some limitations, similar to the ones that apply to
> VFIO: migration, introspection, management etc. are mostly left out
> for now. (IOW, qemu-mp is trying to do too many things simultaneously)

qemu-mp has been cut down significantly in order to make it
non-invasive.  The model is now much cleaner:
1. No monitor command or command-line option forwarding.  The device
   emulation program has its own command-line and monitor that QEMU
   doesn't know about.
2. No per-device proxy objects.  A single RemotePCIDevice is added to
   QEMU.  In the current patch series it only supports the LSI SCSI
   controller but once the socket protocol is changed to
   vfio-over-socket it will be possible to use any PCI device.
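
To illustrate point 2, here is a rough sketch of what a device-agnostic proxy could look like. It is not the qemu-mp code and not a proposed protocol: the 16-byte request header (command, region, offset, length), the command numbers, and every name below are made up for the example. The point is only that the proxy forwards region accesses without knowing which device sits on the other end of the socket.

```python
import socket
import struct
import threading

# Hypothetical wire format (NOT from any spec): command, region, offset, length.
REQ = struct.Struct("<HHQI")
CMD_READ, CMD_WRITE = 1, 2


def recv_exact(sock, n):
    """Read exactly n bytes from a stream socket or raise if the peer closes."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed")
        buf += chunk
    return bytes(buf)


class GenericRemoteDevice:
    """Client-side proxy: forwards region accesses, knows nothing device-specific."""

    def __init__(self, sock):
        self.sock = sock

    def region_read(self, region, offset, length):
        self.sock.sendall(REQ.pack(CMD_READ, region, offset, length))
        return recv_exact(self.sock, length)

    def region_write(self, region, offset, data):
        self.sock.sendall(REQ.pack(CMD_WRITE, region, offset, len(data)) + data)


def serve(sock, regions):
    """Toy device emulation loop: each region is backed by a plain bytearray."""
    while True:
        try:
            header = recv_exact(sock, REQ.size)
        except ConnectionError:
            return
        cmd, region, offset, length = REQ.unpack(header)
        mem = regions[region]
        if cmd == CMD_READ:
            sock.sendall(bytes(mem[offset:offset + length]))
        elif cmd == CMD_WRITE:
            mem[offset:offset + length] = recv_exact(sock, length)


def demo():
    """Write two bytes into region 0 through the proxy and read them back."""
    client, server = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    worker = threading.Thread(target=serve, args=(server, {0: bytearray(256)}),
                              daemon=True)
    worker.start()
    dev = GenericRemoteDevice(client)
    dev.region_write(0, 0x10, b"\xde\xad")
    data = dev.region_read(0, 0x10, 2)
    client.close()
    worker.join(timeout=1)
    server.close()
    return data
```

Swapping the bytearray-backed server for a real device model would not require any change to GenericRemoteDevice, which is the property the single RemotePCIDevice is after.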

We recently agreed on dropping live migration to further reduce the
patch series.  If you have specific suggestions, please post reviews on
the latest patch series.

The RemotePCIDevice and device emulation program infrastructure it puts
in place are intended to be used by vfio-over-socket in the future.  I
see it as complementary to vfio-over-socket rather than as a
replacement.  Elena, Jag, and JJ have been working on it for a long time
and I think we should build on top of it (replacing parts as needed)
rather than propose a new plan that sidelines their work.

> That's the rough ideas/plan I have in mind:
> - draft/define a "vfio over unix" protocol
> - similar to vhost-user, also define some backend conventions
> https://github.com/qemu/qemu/blob/master/docs/interop/vhost-user.rst#backend-program-conventions
> - modify qemu vfio code to allow using a socket backend, i.e. something
> like "-chardev socket=foo -device vfio-pci,chardev=foo"

I think JJ has been working on this already.  Not sure what the status
is.

> - implement some test devices (and outside qemu, in whatever
> language/framework - the more the merrier!)
> - investigate how an existing qemu binary could expose some devices over
> "vfio-unix", for example: "qemu -machine none -chardev socket=foo,server
> -device pci-serial,vfio=foo". This would avoid a lot of the proxy and code
> churn proposed in qemu-mp.

This is similar to the qemu-mp approach.  I think they found that doing
this in practice requires a RemotePCIBus and a
RemoteInterruptController.  Something along these lines:

  qemu -machine none \
       -chardev socket=foo,server \
       -device remote-pci-bus,chardev=foo \
       -device pci-serial # added to the remote-pci-bus

PCI devices you want to instantiate are completely unmodified - no need
to even add a vfio= parameter.  They just happen to be on a RemotePCIBus
instead of a regular PCI bus.  That way they can be accessed via
vfio-over-socket and interrupts are also handled remotely.
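
A toy sketch of that idea, with the caveat that everything in it is hypothetical: the class names, the `set_irq()` callback, and the 3-byte interrupt message are invented for illustration and are not taken from qemu-mp or from any protocol draft. The device model only ever calls into "its bus"; the bus decides that an interrupt becomes a message on the socket instead of a local IRQ line.

```python
import socket
import struct

# Hypothetical interrupt notification (NOT from any spec): slot number, level.
IRQ_MSG = struct.Struct("<HB")


class ToySerial:
    """Stand-in for an unmodified device model: it only talks to its bus."""

    def __init__(self):
        self.bus = None
        self.slot = None
        self.rx = bytearray()

    def bar_write(self, offset, data):
        self.rx += data
        # The same call the device would make on a regular, local bus.
        self.bus.set_irq(self.slot, 1)


class RemoteBus:
    """Toy remote bus: routes accesses to devices and forwards their interrupts."""

    def __init__(self, sock):
        self.sock = sock
        self.devices = {}

    def attach(self, slot, dev):
        dev.bus, dev.slot = self, slot
        self.devices[slot] = dev

    def dispatch_write(self, slot, offset, data):
        # In a real setup this would be driven by requests arriving on the socket.
        self.devices[slot].bar_write(offset, data)

    def set_irq(self, slot, level):
        # Interrupts are handled remotely: serialize instead of raising locally.
        self.sock.sendall(IRQ_MSG.pack(slot, level))


def demo_irq():
    """Attach a device at slot 3, poke it, and return the forwarded interrupt."""
    vm_side, device_side = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    bus = RemoteBus(device_side)
    bus.attach(3, ToySerial())
    bus.dispatch_write(3, 0, b"A")
    raw = vm_side.recv(IRQ_MSG.size, socket.MSG_WAITALL)
    vm_side.close()
    device_side.close()
    return IRQ_MSG.unpack(raw)
```

ToySerial has no idea a socket is involved, which mirrors the claim that the PCI devices themselves stay unmodified.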

Stefan
