qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 01/18] vfio/migration: Add VFIO migration pre-copy support


From: Avihai Horon
Subject: Re: [PATCH 01/18] vfio/migration: Add VFIO migration pre-copy support
Date: Tue, 31 Jan 2023 14:44:54 +0200
User-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101 Thunderbird/102.6.1


On 27/01/2023 1:52, Alex Williamson wrote:
External email: Use caution opening links or attachments


On Thu, 26 Jan 2023 20:49:31 +0200
Avihai Horon <avihaih@nvidia.com> wrote:

Pre-copy support allows the VFIO device data to be transferred while the
VM is running. This helps to accommodate VFIO devices that have a large
amount of data that needs to be transferred, and it can reduce migration
downtime.

Pre-copy support is optional in VFIO migration protocol v2.
Implement pre-copy of VFIO migration protocol v2 and use it for devices
that support it. Full description of it can be found here [1].

[1]
https://lore.kernel.org/kvm/20221206083438.37807-3-yishaih@nvidia.com/

Signed-off-by: Avihai Horon <avihaih@nvidia.com>
---
  docs/devel/vfio-migration.rst |  29 ++++++---
  include/hw/vfio/vfio-common.h |   3 +
  hw/vfio/common.c              |   8 ++-
  hw/vfio/migration.c           | 112 ++++++++++++++++++++++++++++++++--
  hw/vfio/trace-events          |   5 +-
  5 files changed, 140 insertions(+), 17 deletions(-)

diff --git a/docs/devel/vfio-migration.rst b/docs/devel/vfio-migration.rst
index 1d50c2fe5f..51f5e1a537 100644
--- a/docs/devel/vfio-migration.rst
+++ b/docs/devel/vfio-migration.rst
@@ -7,12 +7,14 @@ the guest is running on source host and restoring this saved 
state on the
  destination host. This document details how saving and restoring of VFIO
  devices is done in QEMU.

-Migration of VFIO devices currently consists of a single stop-and-copy phase.
-During the stop-and-copy phase the guest is stopped and the entire VFIO device
-data is transferred to the destination.
-
-The pre-copy phase of migration is currently not supported for VFIO devices.
-Support for VFIO pre-copy will be added later on.
+Migration of VFIO devices consists of two phases: the optional pre-copy phase,
+and the stop-and-copy phase. The pre-copy phase is iterative and allows to
+accommodate VFIO devices that have a large amount of data that needs to be
+transferred. The iterative pre-copy phase of migration allows for the guest to
+continue whilst the VFIO device state is transferred to the destination, this
+helps to reduce the total downtime of the VM. VFIO devices can choose to skip
+the pre-copy phase of migration by not reporting the VFIO_MIGRATION_PRE_COPY
+flag in VFIO_DEVICE_FEATURE_MIGRATION ioctl.

  A detailed description of the UAPI for VFIO device migration can be found in
  the comment for the ``vfio_device_mig_state`` structure in the header file
@@ -29,6 +31,12 @@ VFIO implements the device hooks for the iterative approach 
as follows:
    driver, which indicates the amount of data that the vendor driver has yet to
    save for the VFIO device.

+* An ``is_active_iterate`` function that indicates ``save_live_iterate`` is
+  active only if the VFIO device is in pre-copy states.
+
+* A ``save_live_iterate`` function that reads the VFIO device's data from the
+  vendor driver during iterative phase.
+
  * A ``save_state`` function to save the device config space if it is present.

  * A ``save_live_complete_precopy`` function that sets the VFIO device in
@@ -91,8 +99,10 @@ Flow of state changes during Live migration
  ===========================================

  Below is the flow of state change during live migration.
-The values in the brackets represent the VM state, the migration state, and
+The values in the parentheses represent the VM state, the migration state, and
  the VFIO device state, respectively.
+The text in the square brackets represents the flow if the VFIO device supports
+pre-copy.

  Live migration save path
  ------------------------
@@ -104,11 +114,12 @@ Live migration save path
                                    |
                       migrate_init spawns migration_thread
                  Migration thread then calls each device's .save_setup()
-                       (RUNNING, _SETUP, _RUNNING)
+                  (RUNNING, _SETUP, _RUNNING [_PRE_COPY])
                                    |
-                      (RUNNING, _ACTIVE, _RUNNING)
+                  (RUNNING, _ACTIVE, _RUNNING [_PRE_COPY])
               If device is active, get pending_bytes by .save_live_pending()
            If total pending_bytes >= threshold_size, call .save_live_iterate()
+                  [Data of VFIO device for pre-copy phase is copied]
          Iterate till total pending bytes converge and are less than threshold
                                    |
    On migration completion, vCPU stops and calls .save_live_complete_precopy 
for
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 5f8e7a02fe..88c2194fb9 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -67,7 +67,10 @@ typedef struct VFIOMigration {
      int data_fd;
      void *data_buffer;
      size_t data_buffer_size;
+    uint64_t mig_flags;
      uint64_t stop_copy_size;
+    uint64_t precopy_init_size;
+    uint64_t precopy_dirty_size;
  } VFIOMigration;

  typedef struct VFIOAddressSpace {
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 9a0dbee6b4..93b18c5e3d 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -357,7 +357,9 @@ static bool vfio_devices_all_dirty_tracking(VFIOContainer 
*container)

              if ((vbasedev->pre_copy_dirty_page_tracking == ON_OFF_AUTO_OFF) &&
                  (migration->device_state == VFIO_DEVICE_STATE_RUNNING ||
-                 migration->device_state == VFIO_DEVICE_STATE_RUNNING_P2P)) {
+                 migration->device_state == VFIO_DEVICE_STATE_RUNNING_P2P ||
+                 migration->device_state == VFIO_DEVICE_STATE_PRE_COPY ||
+                 migration->device_state == VFIO_DEVICE_STATE_PRE_COPY_P2P)) {
Should this just turn into a test that we're not in STOP_COPY?

Then we would need to check we are not in STOP_COPY and not in STOP.
The STOP check is for the case where PRE_COPY is not supported, since RAM will ask for dirty page sync when the device is in STOP state.
Without the STOP check, the device will skip the final dirty page sync.


                  return false;
              }
          }
@@ -387,7 +389,9 @@ static bool 
vfio_devices_all_running_and_mig_active(VFIOContainer *container)
              }

              if (migration->device_state == VFIO_DEVICE_STATE_RUNNING ||
-                migration->device_state == VFIO_DEVICE_STATE_RUNNING_P2P) {
+                migration->device_state == VFIO_DEVICE_STATE_RUNNING_P2P ||
+                migration->device_state == VFIO_DEVICE_STATE_PRE_COPY ||
+                migration->device_state == VFIO_DEVICE_STATE_PRE_COPY_P2P) {
                  continue;
              } else {
                  return false;
Hmm, this only seems to highlight that between this series and the
previous, we're adding tests for states that we never actually use, ie.
these _P2P states.

IIRC, the reason we have these _P2P states is so that we can transition
a set of devices, which may have active P2P DMA between them, to STOP,
STOP_COPY, and even RUNNING states safely without lost data given that
we cannot simultaneously transition all devices.  That suggest that
missing from both these series is support for bringing all devices to
these _P2P states before we move any device to one of STOP, STOP_COPY,
or RUNNING states (in the case of RESUMING).

Also, I recall discussions that we need to enforce configuration
restrictions when not all devices support the _P2P states?  For example
adding a migration blocker when there are multiple vfio devices and at
least one of them does not support _P2P migration states.  Or perhaps
initially, requiring support for _P2P states.

I think what's implemented here, where we don't make use of the _P2P
states would require adding a migration blocker whenever there are
multiple vfio devices, regardless of the device support for _P2P.

Yes, I think you are right.

I will add a migration blocker if there are multiple devices as part of v9 of the basic series. When P2P support is added, I will block migration of multiple devices if one or more of them doesn't support P2P.

Thanks.




reply via email to

[Prev in Thread] Current Thread [Next in Thread]