qemu-devel

Re: [Qemu-devel] [PATCH 3/4] net/virtio: add failover support


From: Alex Williamson
Subject: Re: [Qemu-devel] [PATCH 3/4] net/virtio: add failover support
Date: Fri, 31 May 2019 14:29:33 -0600

On Fri, 31 May 2019 19:45:13 +0100
"Dr. David Alan Gilbert" <address@hidden> wrote:

> * Michael S. Tsirkin (address@hidden) wrote:
> > On Fri, May 31, 2019 at 02:01:54PM -0300, Eduardo Habkost wrote:  
> > > > Yes. It's just lots of extremely low level interfaces
> > > > and all rather pointless.
> > > > 
> > > > And down the road extensions like surprise removal support will make it
> > > > all cleaner and more transparent. Floating things up to libvirt means
> > > > all these low level details will require more and more hacks.  
> > > 
> > > Why do you call it pointless?  
> > 
> > We'd need APIs to manipulate device visibility to guest, hotplug
> > controller state and separately manipulate the resources allocated. This
> > is low level stuff that users really have no idea what to do about.
> > Exposing such a level of detail to management is imho pointless.
> > We are better off with a high level API, see below.  
> 
> so I don't know much about vfio; but to me it strikes me that
> you wouldn't need that low level detail if we just reworked vfio
> to look more like all our other devices;

I don't understand what this means; I thought vfio-pci followed a very
standard device model.

> something like:
> 
>   -vfiodev  host=02:00.0,id=gpu
>   -device vfio-pci,dev=gpu
>
> The 'vfiodev' would own the resources; so to do this trick, the
> management layer would:
>    hotunplug the vfio-pci
>    migrate
> 
> if anything went wrong it would
>    hotplug the vfio-pci back in
> 
> you wouldn't have free'd up any resources because they belonged
> to the vfiodev.

So you're looking for some sort of frontend/backend separation: we
hot-unplug the frontend device that's exposed to the guest while the
backend device that holds the host resources stays attached.  I would
hardly have guessed that's "like all our other devices".  I was under
the impression (mostly from previous discussions) that the device
removal would be caught before the device was actually allowed to
finalize and exit, so that after a failed migration, re-adding the
device would be deterministic because the device is never released
back to the host.  I expected that could be done within QEMU, but I
guess what we're getting into here is how management tools specify
that eject-without-release semantic.  I don't know what this
frontend/backend rework would look like for vfio-pci, but it seems
non-trivial for this one use case, and I don't see that it adds any
value outside of it.  Perhaps quite the opposite: it's an overly
complicated interface for the majority of use cases, so we either move
to a more complicated interface or maintain both.  Poor choices either
way.  Thanks,

Alex
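
A minimal sketch of the "eject without release" semantic discussed
above, written as a toy Python model rather than QEMU code.  The names
HostDevice, GuestFrontend and migrate_with_failover are hypothetical
and only illustrate the proposed split: the backend keeps the host
resources while the frontend is unplugged from the guest, so re-adding
it after a failed migration cannot fail for lack of the device.

```python
class HostDevice:
    """Backend: owns the host resources (hypothetical stand-in for the
    proposed 'vfiodev' object; not an existing QEMU interface)."""

    def __init__(self, host_addr):
        self.host_addr = host_addr
        self.bound = True  # still held by this VM, not returned to the host

    def release(self):
        self.bound = False


class GuestFrontend:
    """Frontend: the device the guest sees (stand-in for 'vfio-pci,dev=...')."""

    def __init__(self, backend):
        self.backend = backend
        self.visible = False

    def plug(self):
        # Re-adding is deterministic only while the backend is still held.
        if not self.backend.bound:
            raise RuntimeError("backend already released back to the host")
        self.visible = True

    def unplug(self):
        # Eject from the guest only; the backend is deliberately untouched.
        self.visible = False


def migrate_with_failover(frontend, do_migrate):
    """Unplug the frontend, migrate, and plug it back in on failure.

    do_migrate is a caller-supplied callable returning True on success.
    """
    frontend.unplug()
    try:
        ok = bool(do_migrate())
    except Exception:
        ok = False
    if ok:
        frontend.backend.release()  # source no longer needs the device
    else:
        frontend.plug()             # safe: the backend was never released
    return ok
```

In this model a failed migration leaves the guest with the same device
it had before, while a successful one is the only path that returns the
device to the host.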


