qemu-devel

Re: [RFC v3] VFIO Migration


From: Cornelia Huck
Subject: Re: [RFC v3] VFIO Migration
Date: Mon, 16 Nov 2020 14:52:26 +0100

On Mon, 16 Nov 2020 11:02:51 +0000
Stefan Hajnoczi <stefanha@redhat.com> wrote:

> On Wed, Nov 11, 2020 at 04:35:43PM +0100, Cornelia Huck wrote:
> > On Wed, 11 Nov 2020 15:14:49 +0000
> > Stefan Hajnoczi <stefanha@redhat.com> wrote:
> >   
> > > On Wed, Nov 11, 2020 at 12:48:53PM +0100, Cornelia Huck wrote:  
> > > > On Tue, 10 Nov 2020 13:14:04 -0700
> > > > Alex Williamson <alex.williamson@redhat.com> wrote:    
> > > > > On Tue, 10 Nov 2020 09:53:49 +0000
> > > > > Stefan Hajnoczi <stefanha@redhat.com> wrote:    
> > > >     
> > > > > > > Device models supported by an mdev driver and their details can be
> > > > > > > read from the migration_info.json attr. Each mdev type supports one
> > > > > > > device model. If a parent device supports multiple device models
> > > > > > > then each device model has an mdev type. There may be multiple mdev
> > > > > > > types for a single device model when they offer different migration
> > > > > > > parameters such as resource capacity or feature availability.
> > > > > > > 
> > > > > > > For example, a graphics card that supports 4 GB and 8 GB device
> > > > > > > instances would provide gfx-4GB and gfx-8GB mdev types with
> > > > > > > memory=4096 and memory=8192 migration parameters, respectively.
> > > > > 
> > > > > 
> > > > > I think this example could be expanded for clarity.  I think this is
> > > > > suggesting we have mdev_types of gfx-4GB and gfx-8GB, which each
> > > > > implement some common device model, ie. com.gfx/GPU, where the
> > > > > migration parameter 'memory' for each defaults to a value matching the
> > > > > type name.  But it seems like this can also lead to some combinatorial
> > > > > challenges for management tools if these parameters are writable.  For
> > > > > > example, should a management tool create a gfx-4GB device and change
> > > > > > the memory parameter to 8192 or a gfx-8GB device with the default
> > > > > > parameter?
> > > > 
> > > > I would expect that the mdev types need to match in the first place.
> > > > What role would the memory= parameter play, then? Allowing gfx-4GB to
> > > > have memory=8192 feels wrong to me.    
> > > 
> > > Yes, I expected these mdev types to only accept a fixed "memory" value,
> > > but there's nothing stopping a driver author from making "memory" accept
> > > any value.  
> > 
> > I'm wondering how useful the memory parameter is, then. The layer
> > checking for compatibility can filter out inconsistent settings, but
> > why would we need to express something that is already implied in the
> > mdev type separately?  
> 
> To avoid tying device instances to specific mdev types. An mdev type is
> a device implementation, but the goal is to enable migration between
> device implementations (new/old or completely different
> implementations).
> 
> Imagine a new physical device that now offers variable memory because
> users found the static mdev types too constraining.  How do you migrate
> back and forth between new and old physical devices if the migration
> parameters don't describe the memory size? Migration parameters make it
> possible. Without them the management tool needs to hard-code knowledge
> of specific mdev types that support migration.

But doesn't the management tool *still* need to keep hardcoded
information about what the value of that memory parameter was for an
existing mdev type? If we have gfx-variable with a memory parameter,
fine; but if the target is supposed to accept a gfx-4GB device, it
should simply instantiate a gfx-4GB device.

I'm getting a bit worried about the complexity of the checking that
management software is supposed to perform. Is it really that bad to
restrict the models to a few, well-defined ones? Especially in the mdev
case, where we have control over what is getting instantiated?
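
To make the checking concern concrete, here is a rough sketch of what a
management tool might have to do when matching on device model plus
migration parameters rather than on the mdev type name alone. Everything
in it is hypothetical: MdevType, find_target_types and the "allowed"
table are illustrative names, not an existing kernel, QEMU or libvirt
interface, and the per-type memory values are exactly the kind of
information the tool would need to obtain from somewhere.

```python
# Hypothetical sketch only: none of these names are real kernel, QEMU,
# or libvirt APIs. They illustrate the matching logic under discussion.
from dataclasses import dataclass


@dataclass
class MdevType:
    name: str          # mdev type name, e.g. "gfx-4GB"
    device_model: str  # device model it implements, e.g. "com.gfx/GPU"
    allowed: dict      # migration parameter -> container of accepted values


def find_target_types(source_model, source_params, target_types):
    """Return the names of target mdev types that could accept the source."""
    matches = []
    for mdev_type in target_types:
        # The device model has to match before parameters matter at all.
        if mdev_type.device_model != source_model:
            continue
        # Every source parameter must have a value the target type accepts.
        if all(value in mdev_type.allowed.get(key, ())
               for key, value in source_params.items()):
            matches.append(mdev_type.name)
    return matches


# Fixed-size types accept one memory value; the assumed "gfx-variable"
# type accepts 1 GB to 16 GB in 1 GB steps (an invented range).
target_types = [
    MdevType("gfx-4GB", "com.gfx/GPU", {"memory": (4096,)}),
    MdevType("gfx-8GB", "com.gfx/GPU", {"memory": (8192,)}),
    MdevType("gfx-variable", "com.gfx/GPU",
             {"memory": range(1024, 16385, 1024)}),
]

candidates = find_target_types("com.gfx/GPU", {"memory": 4096}, target_types)
print(candidates)
```

With only fixed types, the parameter check reduces to comparing type
names; it starts to carry information of its own only once something
like gfx-variable exists on one side of the migration.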
