Re: [PATCH 1/8] migration: Add precopy initial data capability
From: Markus Armbruster
Subject: Re: [PATCH 1/8] migration: Add precopy initial data capability
Date: Wed, 17 May 2023 14:21:54 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/28.2 (gnu/linux)

Avihai Horon <avihaih@nvidia.com> writes:
> On 17/05/2023 12:17, Markus Armbruster wrote:
>> Avihai Horon <avihaih@nvidia.com> writes:
>>
>>> Migration downtime estimation is calculated based on bandwidth and
>>> remaining migration data. This assumes that loading of migration data in
>>> the destination takes a negligible amount of time and that downtime
>>> depends only on network speed.
>>>
>>> While this may be true for RAM, it's not necessarily true for other
>>> migration users. For example, loading the data of a VFIO device in the
>>> destination might require the device to allocate resources, prepare
>>> internal data structures and so on. These operations can take a
>>> significant amount of time which can increase migration downtime.
>>>
>>> This patch adds a new capability "precopy initial data" that allows the
>>> source to send initial precopy data and the destination to ACK that this
>>> data has been loaded. Migration will not attempt to stop the source VM
>>> and complete the migration until this ACK is received.
>>>
>>> This will allow migration users to send initial precopy data which can
>>> be used to reduce downtime (e.g., by pre-allocating resources), while
>>> making sure that the source will stop the VM and complete the migration
>>> only after this initial precopy data is sent and loaded in the
>>> destination so it will have full effect.
>>>
>>> This new capability relies on the return path capability to communicate
>>> from the destination back to the source.
>>>
>>> The actual implementation of the capability will be added in the
>>> following patches.
>>>
>>> Signed-off-by: Avihai Horon <avihaih@nvidia.com>
>>> ---
>>> qapi/migration.json | 9 ++++++++-
>>> migration/options.h | 1 +
>>> migration/options.c | 20 ++++++++++++++++++++
>>> 3 files changed, 29 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/qapi/migration.json b/qapi/migration.json
>>> index 82000adce4..d496148386 100644
>>> --- a/qapi/migration.json
>>> +++ b/qapi/migration.json
>>> @@ -478,6 +478,13 @@
>>> # should not affect the correctness of postcopy migration.
>>> # (since 7.1)
>>> #
>>> +# @precopy-initial-data: If enabled, migration will not attempt to stop source
>>> +# VM and complete the migration until an ACK is received
>>> +# from the destination that initial precopy data has
>>> +# been loaded. This can improve downtime if there are
>>> +# migration users that support precopy initial data.
>>> +# (since 8.1)
>>> +#
>> Please format like
>>
>> # @precopy-initial-data: If enabled, migration will not attempt to
>> # stop source VM and complete the migration until an ACK is
>> # received from the destination that initial precopy data has been
>> # loaded. This can improve downtime if there are migration users
>> # that support precopy initial data. (since 8.1)
>>
>> to blend in with recent commit a937b6aa739 (qapi: Reformat doc comments
>> to conform to current conventions).
>
> Sure.
>
>>
>> What do you mean by "if there are migration users that support precopy
>> initial data"?
>
> This capability only provides the framework to send precopy initial data and
> ACK that it was loaded in the destination.
> To actually benefit from it, migration users (such as VFIO devices, RAM,
> etc.) must implement support for it and use it.
>
> What I wanted to say here is that there is no point in enabling this
> capability if there are no migration users that support it.
> For example, if you are migrating a VM without VFIO devices, then enabling
> this capability will have no effect.

I see.

Which "migration users" support it now?

Which could support it in the future?

Is the "initial precopy data" feature described in more detail anywhere?

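For illustration only, the behaviour described in the commit message can
be modelled roughly as below. This is a toy Python sketch, not QEMU
code: the function names, the millisecond units and the downtime limit
argument are all assumptions made up for the example. It shows the
estimate (remaining data divided by bandwidth) and the extra gate the
capability would add (no switchover until the destination has ACKed).

```python
def estimated_downtime_ms(remaining_bytes, bandwidth_bytes_per_ms):
    """Estimate downtime from bandwidth and remaining data only.

    Load time on the destination is ignored, which is the assumption
    the commit message says does not hold for devices such as VFIO.
    """
    if bandwidth_bytes_per_ms <= 0:
        raise ValueError("bandwidth must be positive")
    return remaining_bytes / bandwidth_bytes_per_ms


def may_switch_over(remaining_bytes, bandwidth_bytes_per_ms,
                    downtime_limit_ms, capability_enabled, ack_received):
    """Decide whether the source may stop the VM and complete migration.

    Hypothetical model: with the capability enabled, switchover is held
    back until the destination ACKs that initial precopy data is loaded.
    """
    if capability_enabled and not ack_received:
        return False
    estimate = estimated_downtime_ms(remaining_bytes, bandwidth_bytes_per_ms)
    return estimate <= downtime_limit_ms
```

With the capability disabled the second function reduces to the plain
bandwidth-based check, which matches the "no effect" case mentioned
above for a VM without VFIO devices.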
>> Do I have to ensure the ACK comes by configuring the destination VM in a
>> certain way, and if yes, how exactly?
>
> In v2 of the series, which I will send later, you will also have to
> enable this capability on the destination.

What happens when you enable it on the source and not on the
destination?
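
To make the question concrete, here is a hedged sketch of what enabling
the capability on both sides might look like over QMP. It only builds
the JSON for the existing migrate-set-capabilities command.
"precopy-initial-data" is the name proposed in this patch and may
change, and sending the same list (including "return-path") to both
monitors is an assumption, not something the series confirms.

```python
import json


def set_capabilities_cmd(names):
    """Build a QMP migrate-set-capabilities command enabling `names`."""
    return {
        "execute": "migrate-set-capabilities",
        "arguments": {
            "capabilities": [
                {"capability": name, "state": True} for name in names
            ]
        },
    }


# Assumed capability list; the new capability relies on the return path.
CAPS = ["return-path", "precopy-initial-data"]

source_cmd = set_capabilities_cmd(CAPS)  # sent to the source monitor
dest_cmd = set_capabilities_cmd(CAPS)    # sent to the destination monitor

print(json.dumps(source_cmd, indent=2))
```

The mismatch case asked about above would correspond to sending
source_cmd but not dest_cmd.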
[...]