qemu-devel

Re: [RFC v3] VFIO Migration


From: Cornelia Huck
Subject: Re: [RFC v3] VFIO Migration
Date: Wed, 11 Nov 2020 16:35:43 +0100

On Wed, 11 Nov 2020 15:14:49 +0000
Stefan Hajnoczi <stefanha@redhat.com> wrote:

> On Wed, Nov 11, 2020 at 12:48:53PM +0100, Cornelia Huck wrote:
> > On Tue, 10 Nov 2020 13:14:04 -0700
> > Alex Williamson <alex.williamson@redhat.com> wrote:  
> > > On Tue, 10 Nov 2020 09:53:49 +0000
> > > Stefan Hajnoczi <stefanha@redhat.com> wrote:  
> >   
> > > > Device models supported by an mdev driver and their details can be read
> > > > from the migration_info.json attr. Each mdev type supports one device
> > > > model. If a parent device supports multiple device models then each
> > > > device model has an mdev type. There may be multiple mdev types for a
> > > > single device model when they offer different migration parameters such
> > > > as resource capacity or feature availability.
> > > > 
> > > > For example, a graphics card that supports 4 GB and 8 GB device instances
> > > > would provide gfx-4GB and gfx-8GB mdev types with memory=4096 and
> > > > memory=8192 migration parameters, respectively.
> > > 
> > > 
> > > I think this example could be expanded for clarity.  I think this is
> > > suggesting we have mdev_types of gfx-4GB and gfx-8GB, which each
> > > implement some common device model, ie. com.gfx/GPU, where the
> > > migration parameter 'memory' for each defaults to a value matching the
> > > type name.  But it seems like this can also lead to some combinatorial
> > > challenges for management tools if these parameters are writable.  For
> > > example, should a management tool create a gfx-4GB device and change the
> > > memory parameter to 8192 or a gfx-8GB device with the default parameter?
> > 
> > I would expect that the mdev types need to match in the first place.
> > What role would the memory= parameter play, then? Allowing gfx-4GB to
> > have memory=8192 feels wrong to me.  
> 
> Yes, I expected these mdev types to only accept a fixed "memory" value,
> but there's nothing stopping a driver author from making "memory" accept
> any value.

I'm wondering how useful the memory parameter is, then. The layer
checking for compatibility can filter out inconsistent settings, but
why would we need to express something that is already implied in the
mdev type separately?
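
To make the redundancy concrete, here is a minimal sketch of such a
compatibility check in Python. The names (MdevConfig, IMPLIED_MEMORY,
is_consistent, compatible) are made up for illustration and are not part of
the RFC or of any kernel interface; the sketch assumes each mdev type implies
exactly one "memory" value.

```python
from dataclasses import dataclass

# Hypothetical table: the memory size (in MB) implied by each mdev type name.
IMPLIED_MEMORY = {"gfx-4GB": 4096, "gfx-8GB": 8192}


@dataclass(frozen=True)
class MdevConfig:
    mdev_type: str
    memory: int  # value of the "memory" migration parameter


def is_consistent(cfg: MdevConfig) -> bool:
    """Reject settings where the parameter contradicts the mdev type."""
    return IMPLIED_MEMORY.get(cfg.mdev_type) == cfg.memory


def compatible(src: MdevConfig, dst: MdevConfig) -> bool:
    """Both sides must be consistent and use the same mdev type."""
    return (is_consistent(src) and is_consistent(dst)
            and src.mdev_type == dst.mdev_type)
```

Under these assumptions, a gfx-4GB device with memory=8192 is filtered out by
is_consistent(), and once both types match there is nothing left for the
memory value to decide.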

> 
> > > > An open mdev device typically does not allow migration parameters to be
> > > > changed at runtime. However, certain migration/params attrs may allow
> > > > writes at runtime. Usually these migration parameters only affect the
> > > > device state representation and not the hardware interface. This makes it
> > > > possible to upgrade or downgrade the device state representation at
> > > > runtime so that migration is possible to newer or older device
> > > > implementations.
> > 
> > This refers to generations of device implementations, but not to dynamic
> > configuration changes. Maybe I'm just confused by this sentence, but
> > how are we supposed to get changes across while the mdev is live?
> 
> This is about dynamic configuration changes. For example, if a field was
> forgotten in the device state representation then a migration parameter
> can be added to enable the fix. When the parameter is off the device
> state is incomplete but migration to old device implementations still
> works. An old device can be migrated to a new device implementation with
> the parameter turned off. And then you can safely enable the migration
> parameter at runtime without powering off the guest because it's purely
> a device state representation change, not a hardware interface change
> that would disturb the guest.
> 
> This is kind of similar to QEMU migration subsections.

Ok, I was a bit confused here.

So, we build the stream with the then-current parameters? How is the
compat-checking layer supposed to deal with parameters changing after
the check -- is it a "you get to keep the pieces" situation?
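
A rough sketch of the window I mean, in Python. Everything here is
hypothetical (the Device class, the "fix-foo" parameter name, and the rule
that parameters must match exactly); the RFC does not say that the stream
builder re-validates anything, which is what I am asking about.

```python
class Device:
    """Hypothetical stand-in for an mdev with writable migration params."""

    def __init__(self, params):
        self.params = dict(params)

    def snapshot(self):
        # Copy of the parameters as seen at this moment.
        return dict(self.params)


def check_compat(src_params, dst_params):
    # Assumed rule for the sketch: parameters must match exactly.
    return src_params == dst_params


def build_stream(dev, checked_params):
    # Refuse to build the stream if parameters changed after the check.
    if dev.snapshot() != checked_params:
        raise RuntimeError("migration parameters changed after compat check")
    return {"params": checked_params}


src = Device({"fix-foo": "off"})
checked = src.snapshot()
assert check_compat(checked, {"fix-foo": "off"})  # check passes here

src.params["fix-foo"] = "on"  # runtime write after the check
try:
    build_stream(src, checked)
    stale_detected = False
except RuntimeError:
    stale_detected = True
```

Without a re-check like the one in build_stream(), the stream would be built
with parameters the compat layer never saw.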
