qemu-devel


From: Marc-André Lureau
Subject: Re: RFC: use VFIO over a UNIX domain socket to implement device offloading
Date: Wed, 1 Apr 2020 18:58:20 +0200

Hi

On Wed, Apr 1, 2020 at 5:51 PM Thanos Makatos
<address@hidden> wrote:
>
> > On Thu, Mar 26, 2020 at 09:47:38AM +0000, Thanos Makatos wrote:
> > > Build MUSER with vfio-over-socket:
> > >
> > >         git clone --single-branch --branch vfio-over-socket address@hidden:tmakatos/muser.git
> > >         cd muser/
> > >         git submodule update --init
> > >         make
> > >
> > > Run device emulation, e.g.
> > >
> > >         ./build/dbg/samples/gpio-pci-idio-16 -s <N>
> > >
> > > Where <N> is an available IOMMU group, essentially the device ID, which must not
> > > previously exist in /dev/vfio/.
> > >
> > > Run QEMU using the vfio wrapper library and specifying the MUSER device:
> > >
> > >         LD_PRELOAD=muser/build/dbg/libvfio/libvfio.so qemu-system-x86_64 \
> > >                 ... \
> > >                 -device vfio-pci,sysfsdev=/dev/vfio/<N> \
> > >                 -object memory-backend-file,id=ram-node0,prealloc=yes,mem-path=mem,share=yes,size=1073741824 \
> > >                 -numa node,nodeid=0,cpus=0,memdev=ram-node0
> > >

FYI, with QEMU 5.0 you no longer need -numa:

-object memory-backend-memfd,id=mem,size=2G -M memory-backend=mem

(hopefully, we will get something even simpler one day)

> > > Bear in mind that since this is just a PoC, lots of things can break, e.g. some
> > > system call not being intercepted.
> >
> > Cool, I had a quick look at libvfio and how the transport integrates
> > into libmuser.  The integration on the libmuser side is nice and small.
> >
> > It seems likely that there will be several different implementations of
> > the vfio-over-socket device side (server):
> > 1. libmuser
> > 2. A Rust equivalent to libmuser
> > 3. Maybe a native QEMU implementation for multi-process QEMU (I think JJ
> >    has been investigating this?)
> >
> > In order to interoperate we'll need to maintain a protocol
> > specification.  Maybe you and JJ could put that together and CC the vfio,
> > rust-vmm, and QEMU communities for discussion?
>
> Sure, I can start by drafting a design doc and share it.

OK! I am quite amazed you went this far with an LD_PRELOAD hack. It
demonstrates some limits of GPL projects, if a demonstration was needed.

I think with this work, and the muser experience, you have a pretty
good idea of what the protocol could look like. My approach, as I
remember it, was a fairly straightforward translation of VFIO onto a
socket, while trying to see whether it could share some aspects with
vhost-user, such as memory handling.
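
To make the memory-handling point concrete: vhost-user shares guest
memory by passing file descriptors as SCM_RIGHTS ancillary data on the
UNIX socket, and a VFIO-over-socket protocol could plausibly do the
same. Below is a minimal sketch of that mechanism only. The message
layout (REGION_FMT) and the helper names are made up for illustration
and are not taken from vhost-user, muser, or any draft spec.

```python
import array
import mmap
import os
import socket
import struct
import tempfile

# Hypothetical payload: guest address and size of one memory region.
REGION_FMT = "<QQ"


def send_region(sock, fd, guest_addr, size):
    """Send a region description with its backing fd attached (SCM_RIGHTS)."""
    payload = struct.pack(REGION_FMT, guest_addr, size)
    fds = array.array("i", [fd])
    sock.sendmsg([payload], [(socket.SOL_SOCKET, socket.SCM_RIGHTS, fds)])


def recv_region(sock):
    """Receive a region description and the fd that came with it."""
    fds = array.array("i")
    data, ancdata, _flags, _addr = sock.recvmsg(
        struct.calcsize(REGION_FMT), socket.CMSG_LEN(fds.itemsize))
    for level, ctype, cdata in ancdata:
        if level == socket.SOL_SOCKET and ctype == socket.SCM_RIGHTS:
            fds.frombytes(cdata[:len(cdata) - (len(cdata) % fds.itemsize)])
    guest_addr, size = struct.unpack(REGION_FMT, data)
    return guest_addr, size, fds[0]


def demo():
    """Share one page between a 'client' and a 'server' end of a socketpair."""
    client, server = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    size = mmap.PAGESIZE
    with tempfile.TemporaryFile() as backing:
        backing.truncate(size)
        with mmap.mmap(backing.fileno(), size) as client_view:
            send_region(client, backing.fileno(), 0x1000, size)
            guest_addr, rsize, rfd = recv_region(server)
            # The server maps the received fd; its write is visible to the
            # client because both mappings are shared views of one file.
            with mmap.mmap(rfd, rsize) as server_view:
                server_view[0:5] = b"hello"
            os.close(rfd)
            seen = bytes(client_view[0:5])
    client.close()
    server.close()
    return guest_addr, rsize, seen
```

Calling demo() sends the fd across the socketpair and returns the
region description together with the bytes the client side sees after
the server side wrote through its own mapping.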

In contrast with the work done in the qemu-mp series, I'd also prefer
that we focus on a vfio-like protocol first, before trying to see how
qemu code and interfaces could be split across multiple binaries. We
will start with some limitations, similar to those that apply to
VFIO: migration, introspection, management, etc. are mostly left out
for now. (IOW, qemu-mp is trying to do too many things simultaneously.)

Here is the rough plan I have in mind:
- draft/define a "vfio over unix" protocol
- similar to vhost-user, also define some backend conventions
https://github.com/qemu/qemu/blob/master/docs/interop/vhost-user.rst#backend-program-conventions
- modify qemu vfio code to allow using a socket backend, i.e. something
like "-chardev socket=foo -device vfio-pci,chardev=foo"
- implement some test devices (and outside qemu, in whatever
language/framework - the more the merrier!)
- investigate how an existing qemu binary could expose some devices over
"vfio-unix", e.g. "qemu -machine none -chardev socket=foo,server
-device pci-serial,vfio=foo". This would avoid a lot of the proxy and
code churn proposed in qemu-mp.
- think about evolution of QMP, so that commands are dispatched to the
right process. In my book, this is called a bus, and I would go for
DBus (not through qemu) in the long term. But for now, we probably
want to split QMP code to make it more modular (in qemu-mp series,
this isn't stellar either). Later on, perhaps look at bridging QMP
over DBus.
- code refactoring in qemu, to allow smaller binaries, that implement
the minimum for vfio-user devices. (imho, this will be a bit easier
after the meson move, as the build system is simpler)

That should allow some work sharing.
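
As a strawman for the first item in the plan, here is a minimal sketch
of what a framed request/reply exchange over the socket could look
like. Everything in it is an assumption made for illustration: the
header layout (message id, command, payload size), the command numbers,
and the REGION_READ request are hypothetical and do not come from VFIO,
vhost-user, or muser.

```python
import socket
import struct
import threading

# Hypothetical header: message id, command, payload size (little endian).
HDR = struct.Struct("<HHI")
CMD_REGION_READ = 1  # hypothetical command numbers
CMD_REPLY = 2


def recv_exact(sock, n):
    """Read exactly n bytes, or raise if the peer goes away first."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the socket")
        buf += chunk
    return bytes(buf)


def send_msg(sock, msg_id, cmd, payload=b""):
    sock.sendall(HDR.pack(msg_id, cmd, len(payload)) + payload)


def recv_msg(sock):
    msg_id, cmd, size = HDR.unpack(recv_exact(sock, HDR.size))
    return msg_id, cmd, recv_exact(sock, size)


def serve_one(sock, region):
    """Device side: answer a single REGION_READ(offset, count) request."""
    msg_id, cmd, payload = recv_msg(sock)
    if cmd != CMD_REGION_READ:
        raise ValueError(f"unexpected command {cmd}")
    offset, count = struct.unpack("<QI", payload)
    send_msg(sock, msg_id, CMD_REPLY, bytes(region[offset:offset + count]))


def region_read(sock, msg_id, offset, count):
    """Client side: request count bytes at offset and wait for the reply."""
    send_msg(sock, msg_id, CMD_REGION_READ, struct.pack("<QI", offset, count))
    reply_id, cmd, data = recv_msg(sock)
    if reply_id != msg_id or cmd != CMD_REPLY:
        raise ValueError("mismatched reply")
    return data


def demo():
    client, server = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    region = bytes(range(16))  # stand-in for a device region
    t = threading.Thread(target=serve_one, args=(server, region))
    t.start()
    data = region_read(client, 7, 4, 4)
    t.join()
    client.close()
    server.close()
    return data
```

The explicit length in the header is what lets either side read whole
messages from a stream socket; the message id is there so replies can
be matched to requests if the protocol later allows more than one
request in flight.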

I look forward to your design draft, and to seeing how I can help.

>
> > It should cover the UNIX domain socket connection semantics (does a
> > listen socket only accept 1 connection at a time?  What happens when the
> > client disconnects?  What happens when the server disconnects?), how
> > VFIO structs are exchanged, any vfio-over-socket specific protocol
> > messages, etc.  Basically everything needed to write an implementation
> > (although it's not necessary to copy the VFIO struct definitions from
> > the kernel headers into the spec or even document their semantics if
> > they are identical to kernel VFIO).
> >
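
On the connection-semantics questions, one possible answer (an
assumption for the sake of discussion, not something the PoC or any
spec mandates) is a server that handles a single client at a time and
treats an empty read as a disconnect, after which it accepts the next
client. A small sketch, with a made-up socket name and helper names:

```python
import os
import socket
import tempfile
import threading


def serve(listener, n_clients, log):
    """Serve n_clients connections strictly one after the other."""
    for _ in range(n_clients):
        conn, _addr = listener.accept()
        with conn:
            while True:
                data = conn.recv(4096)
                if not data:
                    # An empty read means the client closed its end.
                    log.append("disconnect")
                    break
                log.append(data)


def demo():
    log = []
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "vfio.sock")  # hypothetical socket path
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as listener:
            listener.bind(path)
            listener.listen(1)
            t = threading.Thread(target=serve, args=(listener, 2, log))
            t.start()
            for payload in (b"first", b"second"):
                with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as c:
                    c.connect(path)
                    c.sendall(payload)
            t.join()
    return log
```

Whether a second client should queue like this, be refused, or replace
the first one is exactly the kind of thing the spec would need to pin
down.
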
> > The next step beyond the LD_PRELOAD library is a native vfio-over-socket
> > client implementation in QEMU.  There is a prototype here:
> > https://github.com/elmarco/qemu/blob/wip/vfio-user/hw/vfio/libvfio-user.c
> >
> > If there are any volunteers for working on that then this would be a
> > good time to discuss it.
> >
> > Finally, has anyone looked at CrosVM's out-of-process device model?  I
> > wonder if it has any features we should consider...
> >
> > Looks like a great start to vfio-over-socket!
>


-- 
Marc-André Lureau


