qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[RFC v3] VFIO Migration


From: Stefan Hajnoczi
Subject: [RFC v3] VFIO Migration
Date: Tue, 10 Nov 2020 09:53:49 +0000

v3:
 * Introduce migration info JSON to describe migration parameters
 * Rework mdev sysfs interface
 * Propose standard interface for vfio-user device emulation programs

VFIO Migration
==============
This document describes how to ensure migration compatibility for VFIO devices,
including mdev and vfio-user devices.

Overview
--------
VFIO devices can save and load a *device state*. Saving a device state produces
a snapshot of a VFIO device that can be loaded again at a later point in time
to resume the device from the snapshot.

The process of saving a device state and loading it later is called
*migration*. The device state may be loaded by the same device instance that
saved it or by a new instance, possibly running on a different machine.

A VFIO/mdev driver together with the physical device provides the functionality
of a device. Alternatively, a vfio-user device emulation program can provide
the functionality of a device. These are called *device implementations*.

The device implementation where a migration originates is called the *source*
and the device implementation that a migration targets is called the
*destination*.

Although it is possible to migrate device state without regard to migration
compatibility, this is prone to failure except in trivial cases. Device
implementations vary in feature availability and resource capacity so that it
is difficult to be confident that a migration of a complex device will succeed.
Furthermore, without migration compatibility checking it is possible that
migration appears to succeed but leaves the device in an inoperable state,
leading to data loss or corruption.

This document describes how to establish migration compatibility between the
source and destination. A check can be performed before migrating and can
therefore be used to select a suitable migration destination. When
compatibility has been established, the probability of migrating successfully
is high and a successful migration does not leave the device inoperable due to
silent migration problems.

Migration Parameters
--------------------
*Migration parameters* are used to describe characteristics that must match
between source and destination to achieve migration compatibility.

The first implementation of a simple device may not require migration
parameters if the source and destination are always compatible. As the device
evolves, the source and destination may differ and migration parameters are
required to express these differences. More complex devices may require
migration parameters from the start due to optional functionality that is not
guaranteed to be present in both source and destination.

A migration parameter consists of a name and a value. The name is a UTF-8
string that does not contain equals ('='), backslash ('/'), or whitespace
characters. The value is a UTF-8 string that does not contain newline
characters ('\n').

The meaning of the migration parameter and its possible values are specific to
the device, but values are based on one of the following types:
* bool - booleans (on/off)
* int - integers (0, 1, 2, ...)
* str - character strings

Migration parameters are represented as <name>=<value> in this document.
Examples include my-feature=on and num-queues=4.

When a new migration parameter is introduced, its absence must have the same
effect as before the migration parameter was introduced. For example, if
my-feature=on|off is added to control the availability of a new device feature,
then my-feature=off is equivalent to omitting the migration parameter.

Hardware Interface Compatibility
--------------------------------
VFIO devices have a *hardware interface* consisting of device regions and
interrupts. Aspects of the hardware interface can vary between device
implementations and require migration parameters to express migration
compatibility requirements.

Examples of migration parameters include:
* Feature availability - feature bitmasks, hardware revision numbers, etc. If
  the destination may lack support for optional features or hardware interface
  revisions, then migration parameters are required.
* Functionality - hardware register blocks that are only present on certain
  device instances. If there are multiple devices sub-models that have
  different hardware interfaces then migration parameters are required.
* Resource capacity - size of display framebuffers, number of queues, queue
  size, etc.

These examples demonstrate aspects of the hardware interface that must not
change unexpectedly. Were they to differ between source and destination, the
chance of device driver malfunction would be high because the layout of the
hardware interface would change or assumptions the device driver makes about
available functionality would be violated. Migration parameters are used to
preserve the hardware interface across migration and explicitly represent
variations between device implementations.

Hardware interfaces sometimes support reporting an event when a change occurs.
In those cases it may be possible to support visible changes in the hardware
interface across migration. In most other cases migration must not result in a
visible change in the hardware interface.

Migration parameters are not necessary for read-only values exposed through the
hardware interface, such as MAC address EEPROMs or serial numbers, so long as
all device implementations can be configured with the same range of input
values for these read-only values. This is possible because migration
parameters do not capture the full configuration of the device, only aspects
that affect migration compatibility.

Device configuration that is not visible through the hardware interface, such
as a host file system path of a disk image file or the physical network port
assigned to a network card, usually does not require migration parameters
because they can be changed without breaking migration compatibility.

The disk image file may indirectly affect the hardware interface, for example
by constraining the device's block size to a specific value. In this case a
block-size=N migration parameter is required to ensure migration compatibility,
but the host file system path of the disk image file still does not require a
migration parameter.

Device State Representation
---------------------------
Device state contains both data accessible through the device's hardware
interface and device-internal state needed to restore device operation.

The contents of hardware registers are usually included in the device state if
they can change at runtime. Hardware registers with constant or computed data
may not need to be part of the device state provided that device
implementations can produce the necessary data.

Device-internal state includes the portion of the device's state that cannot be
reconstructed from the hardware interface alone. Defining device-internal state
in the most general way instead of exposing device implementation details
allows for flexibility in the future. For example, device implementations often
maintain a ring index, which is not available through the hardware interface,
to keep track of which ring elements have already been consumed. The ring index
must be included in the device state so that the destination can resume
processing from the correct point in the ring. Representing this as an index
into the ring in the hardware interface is more general than adding device
implementation-specific request tracking data structures into the device state.

The *device state representation* defines the binary data layout of the device
state. The device state representation is specific to each device and is beyond
the scope of this document, but aspects pertaining to migration compatibility
are discussed here.

Each change to the device state representation that affects migration
compatibility requires a migration parameter. When a new field is added to the
device state representation then a new migration parameter must be added to
reflect this change. Often a single migration parameter expresses both a change
to the hardware interface and the device state representation. It is also
possible to change the device state representation without changing the
hardware interface, for example when some state was forgotten while designing
the previous device state representation.

The device state representation may support adding extra data that can be
safely ignored by old device implementations. In this case migration
compatibility is unaffected and a migration parameter is not required to
indicate such extra data has been added.

Device Models
-------------
The combination of the hardware interface, device state representation, and
migration parameter definitions is called a *device model*. Device models are
identified by a unique UTF-8 string starting with a domain name and followed by
path components separated with backslashes ('/'). Examples include
vendor-a.com/my-nic, gitlab.com/user/my-device, virtio-spec.org/pci/virtio-net,
and qemu.org/pci/10ec/8139.

The unique device model string is not changed as the device evolves. Instead,
migration parameters are added to express variations in a device.

The device model is not tied to a specific device implementation. The same
device model could be implemented as a VFIO/dev driver or as a vfio-user device
emulation program.

Multiple device implementations can support the same device model. Doing so
means that the device implementations can offer migration compatiblity because
they support the same hardware interface, device state representation, and
migration parameters.

Multiple device models can exist for the same hardware interface, each with a
different device state representation and migration parameters. This makes it
possible to fork and independently develop device models.

Orchestrating Migrations
------------------------
In order to migrate a device a *migration parameter list* must first be built
on the source. Each migration parameter is added to the list if it is in
effect. For example, the migration parameter list for a device with
new-feature=off,num-queues=4 would be num-queues=4 if the new-feature migration
parameter was introduced with the off value disabling its effect.

The following conditions must be met to establish migration compatibility:

1. The source and destination device model strings match.

2. Each migration parameter name from the migration parameter list is supported
   by the destination. For example, the destination supports the num-queues
   migration parameter.

3. Each migration parameter value from the migration parameter list is
   supported by the destination. For example, the destination supports
   num-queues=4.

The migration compatibility check can be performed without initiating a
migration. Therefore, this process can be used to select the migration
destination.

The following steps perform the migration:

1. Configure the destination so it is prepared to load the device state,
   including applying the migration parameter list. This may involve
   instantiating a new device instance or resetting an existing device instance
   to a configuration that is compatible with the source.

   The details of how to do this for VFIO/mdev drivers and vfio-user device
   backend programs is described below.

2. Save the device state on the source and load it on the destination.

3. If migration succeeds then the destination resumes operation and the source
   must not resume operation. If the migration fails then the source resumes
   operation and the destination must not resume operation.

Note that these steps impose a conservative bound on device states that can be
migrated successfully. Not all configuration parameters may be strictly
required to match on the source and destination devices. For example, if the
device's hardware interface has not yet been initialized then changes to the
advertised features may not yet affect the device driver. However, accurately
representing runtime constraints is complex and risks introducing migration
bugs, so no attempt is made to support them.

Migration Information JSON
--------------------------
Device implementations describe supported device models in the following JSON
format:

.. code:: json

  {
    "models": {
      "<model>": {
        "params": {
          "<param>": {
            "allowed_values": [<value1>, <value2>, ...]
            "description": ...
            "init_value": ...
            "off_value": ...
            "type": ...
          }
        }
      }
    }
  }

The "models" object contains one or more device model objects describing
available device models. Each member name is a unique device model string, for
example "vendor-a.com/my-nic".

The device model object contains a "params" object describes available
migration parameters. Each migration parameter object contains the following
members:

"allowed_values"
  The list all values that the device implementation accepts for this migration
  parameter. Integer ranges can be described using "<min>-<max>" strings.

  Examples: ['a', 'b', 'c'], [1, 5, 7], ['0-255', 512, '1024-2048'], [true]

  This member is optional. When absent, any value suitable for the type may be
  given but the device implementation may refuse certain values.

"description"
  A human-readable description of the migration parameter. This is not intended
  for user interfaces but rather as a troubleshooting aid for developers. The
  description is typically written in English. This member is optional.

"init_value"
  The initial parameter value when a device instance is created. This member is
  required.

"off_value"
  The parameter value that disables the effect of this parameter. This member
  is absent if the migration parameter cannot be disabled.

"type"
  The data type ("bool", "int", "str"). This member is required.

An example of a simple device model that only one feature and a fixed resource
capacity:

.. code:: json

  {
    "models": {
      "vendor-a.com/my-nic": {
        "params": {
          "new-feature": {
            "description": "New feature that old devices lack",
            "init_value": true,
            "off_value": false,
            "type": 'bool'
          },
          "num-resources": {
            "allowed_values": [64],
            "description": "Number of resources",
            "init_value": 64,
            "type": 'int'
          }
        }
      }
    }
  }

Newly created instances of this device will enable "new-feature", but it can be
disabled for migration compatibility with old device instances.

The number of device resources are fixed at 64, so only device instances that
also have exactly 64 resources can be migrated to this device implementation.

VFIO mdev Drivers
-----------------
The following mdev type sysfs attrs are available for managing device
instances::

  /sys/.../<parent-device>/mdev_supported_types/<type-id>/
    create - writing a UUID to this file instantiates a device
    migration_info.json - read-only migration information JSON

TODO The JSON can be represented as a file system hierarchy but sysfs seems
limited to <kobject>/<group>/<attr> and <kobject>/<attr> so it is not possible
to express deeper attr groups like <kobject>/migration/params/<param>/<attr>?

Device models supported by an mdev driver and their details can be read from
the migration_info.json attr. Each mdev type supports one device model. If a
parent device supports multiple device models then each device model has an
mdev type. There may be multiple mdev types for a single device model when they
offer different migration parameters such as resource capacity or feature
availability.

For example, a graphics card that supports 4 GB and 8 GB device instances would
provide gfx-4GB and gfx-8GB mdev types with memory=4096 and memory=8192
migration parameters, respectively.

The following mdev device sysfs attrs relate to a specific device instance::

  /sys/.../<parent-device>/<uuid>/
    mdev_type/ - symlink to mdev type sysfs attrs, e.g. to fetch migration/model
    migration/ - migration related files
      <param> - read/write migration parameter "param"
      ...

When the device is created all migration/<param> attrs take their
migration_info.json "init_value".

When preparing for migration on the source, each migration parameter from
migration/<param> is read and added to the migration parameter list if its
value differs from "off_value" in migration_info.json. If a migration parameter
in the list is not available on the destination, then migration is not
possible. If a migration parameter value is not in the destination
"allowed_values" migration_info.json then migration is not possible.

In order to prepare an mdev device instance for an incoming migration on the
destination, the "off_value" from migration_info.json is written to each
migration parameter in migration/<param>. Then the migration parameter list
from the source is written to migration/<param> one migration parameter at a
time. If an error occurs while writing a migration parameter on the destination
then migration is not possible. Once the migration parameter list has been
written the mdev can be opened and migration can proceed.

An open mdev device typically does not allow migration parameters to be changed
at runtime. However, certain migration/params attrs may allow writes at
runtime. Usually these migration parameters only affect the device state
representation and not the hardware interface. This makes it possible to
upgrade or downgrade the device state representation at runtime so that
migration is possible to newer or older device implementations.

vfio-user Device Emulation Programs
-----------------------------------
Device emulation programs often support a simple invocation model where running
the program creates a single device instance. The lifecycle of the device
instance is tied to the lifecycle of the process. Such device emulation
programs are described below.

More complex device emulation programs may host multiple devices. The interface
for configuring these device emulation programs is not standardized. Therefore,
migrating these devices is beyond the scope of this document.

The migration information JSON is printed to standard output by a vfio-user
device emulation program as follows:

.. code:: bash

  $ my-device --print-migration-info-json

The device is instantiated by launching the destination process with the
migration parameter list from the source:

.. code:: bash

  $ my-device --m-<param1>=<value1> --m-<param2> <value2> [...]

This example shows how to instantiate the device with migration parameters
``param1`` and ``param2``. Both ``--m-<param>=<value>`` and ``--m-<param>
<value>`` option formats are accepted.

The ``--m-`` prefix is used to allow the device emulation program to implement
device implementation-specific command-line options without conflicting with
the migration parameter namespace.

When preparing for migration on the source, each migration parameter from the
migration info JSON is added to the migration parameter list if its value
differs from "off_value". If a migration parameter in the list is not available
on the destination, then migration is not possible. If a migration parameter
value is not in the destination "allowed_values" migration_info.json then
migration is not possible.

On the destination, a command-line is generated from the migration parameter
list. For each destination migration parameter missing from the migration
parameter list a command-line option is added with the destination "off_value".
The device emulation program prints an error message to standard error and
terminates with exit status 1 if the device could not be instantiated.

Attachment: signature.asc
Description: PGP signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]