From: Fabiano Rosas
Subject: Re: [PATCH v2 12/17] migration/multifd: Device state transfer support - send side
Date: Fri, 30 Aug 2024 10:02:40 -0300

"Maciej S. Szmigiero" <mail@maciej.szmigiero.name> writes:
> On 29.08.2024 02:41, Fabiano Rosas wrote:
>> "Maciej S. Szmigiero" <mail@maciej.szmigiero.name> writes:
>>
>>> From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com>
>>>
>>> A new function multifd_queue_device_state() is provided for a device to
>>> queue its state for transmission via a multifd channel.
>>>
>>> Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
>>> ---
>>> include/migration/misc.h | 4 ++
>>> migration/meson.build | 1 +
>>> migration/multifd-device-state.c | 99 ++++++++++++++++++++++++++++++++
>>> migration/multifd-nocomp.c | 6 +-
>>> migration/multifd-qpl.c | 2 +-
>>> migration/multifd-uadk.c | 2 +-
>>> migration/multifd-zlib.c | 2 +-
>>> migration/multifd-zstd.c | 2 +-
>>> migration/multifd.c | 65 +++++++++++++++------
>>> migration/multifd.h | 29 +++++++++-
>>> 10 files changed, 184 insertions(+), 28 deletions(-)
>>> create mode 100644 migration/multifd-device-state.c
>>>
>>> diff --git a/include/migration/misc.h b/include/migration/misc.h
>>> index bfadc5613bac..7266b1b77d1f 100644
>>> --- a/include/migration/misc.h
>>> +++ b/include/migration/misc.h
>>> @@ -111,4 +111,8 @@ bool migration_in_bg_snapshot(void);
>>> /* migration/block-dirty-bitmap.c */
>>> void dirty_bitmap_mig_init(void);
>>>
>>> +/* migration/multifd-device-state.c */
>>> +bool multifd_queue_device_state(char *idstr, uint32_t instance_id,
>>> + char *data, size_t len);
>>> +
>>> #endif
>>> diff --git a/migration/meson.build b/migration/meson.build
>>> index 77f3abf08eb1..00853595894f 100644
>>> --- a/migration/meson.build
>>> +++ b/migration/meson.build
>>> @@ -21,6 +21,7 @@ system_ss.add(files(
>>> 'migration-hmp-cmds.c',
>>> 'migration.c',
>>> 'multifd.c',
>>> + 'multifd-device-state.c',
>>> 'multifd-nocomp.c',
>>> 'multifd-zlib.c',
>>> 'multifd-zero-page.c',
>>> diff --git a/migration/multifd-device-state.c b/migration/multifd-device-state.c
>>> new file mode 100644
>>> index 000000000000..c9b44f0b5ab9
>>> --- /dev/null
>>> +++ b/migration/multifd-device-state.c
>>> @@ -0,0 +1,99 @@
>>> +/*
>>> + * Multifd device state migration
>>> + *
>>> + * Copyright (C) 2024 Oracle and/or its affiliates.
>>> + *
>>> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
>>> + * See the COPYING file in the top-level directory.
>>> + */
>>> +
>>> +#include "qemu/osdep.h"
>>> +#include "qemu/lockable.h"
>>> +#include "migration/misc.h"
>>> +#include "multifd.h"
>>> +
>>> +static QemuMutex queue_job_mutex;
>>> +
>>> +static MultiFDSendData *device_state_send;
>>> +
>>> +size_t multifd_device_state_payload_size(void)
>>> +{
>>> + return sizeof(MultiFDDeviceState_t);
>>> +}
>>
>> This will not be necessary because the payload size is the same as the
>> data type. We only need it for the special case where the MultiFDPages_t
>> is smaller than the total ram payload size.
>
> I know - I just wanted to make the API consistent with the one RAM
> handler provides since these multifd_send_data_alloc() calls are done
> just once per migration - it isn't any kind of a hot path.
>
I think the array at the end of MultiFDPages_t should be considered
enough of a hack that we might want to keep anything related to it
outside of the interface. Other clients shouldn't have to think about
that at all.
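
To illustrate the point about the trailing array, here is a minimal sketch in plain C. The type and function names (PagesPayload, DeviceStatePayload, pages_payload_size, device_state_payload_size) are hypothetical stand-ins, not the actual QEMU definitions. It only shows why a type ending in a flexible array needs a size helper while a fixed-size type can rely on sizeof.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a RAM payload: ends in a flexible array. */
typedef struct {
    uint32_t num;
    uint64_t offset[];   /* flexible array member, not counted by sizeof */
} PagesPayload;

/* Hypothetical stand-in for a device state payload: fixed size. */
typedef struct {
    char *idstr;
    uint32_t instance_id;
    char *buf;
    size_t buf_len;
} DeviceStatePayload;

/* Bytes needed for a PagesPayload with room for n offsets. */
static size_t pages_payload_size(uint32_t n)
{
    assert(n > 0);
    return sizeof(PagesPayload) + (size_t)n * sizeof(uint64_t);
}

/* Fixed-size payload: sizeof already covers the whole object. */
static size_t device_state_payload_size(void)
{
    return sizeof(DeviceStatePayload);
}
```

In this sketch the second helper adds nothing over sizeof, which is the argument for keeping the size hook private to the RAM code.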
>>> @@ -397,20 +404,16 @@ bool multifd_send(MultiFDSendData **send_data)
>>>
>>> p = &multifd_send_state->params[i];
>>> /*
>>> - * Lockless read to p->pending_job is safe, because only multifd
>>> - * sender thread can clear it.
>>> + * Lockless RMW on p->pending_job_preparing is safe, because only multifd
>>> + * sender thread can clear it after it had seen p->pending_job being set.
>>> + *
>>> + * Pairs with qatomic_store_release() in multifd_send_thread().
>>> */
>>> - if (qatomic_read(&p->pending_job) == false) {
>>> + if (qatomic_cmpxchg(&p->pending_job_preparing, false, true) == false) {
>>
>> What's the motivation for this change? It would be better to have it in
>> a separate patch with a proper justification.
>
> The original RFC patch set used dedicated device state multifd channels.
>
> Peter and other people wanted this functionality removed, however this caused
> a performance (downtime) regression.
>
> One of the things that seemed to help mitigate this regression was making
> the multifd channel selection more fair via this change.
>
> But I can split it out into a separate commit in the next patch set version and
> then see what performance improvement it currently brings.

Yes, better to have it separate, if only to document the rationale.
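
For reference while reading the hunk quoted above, here is a rough sketch of the claim-by-compare-exchange pattern using C11 <stdatomic.h> rather than QEMU's qatomic_* helpers. The Channel type and all function names are hypothetical and exist only for illustration; this is not the code from the patch. A producer claims a channel by flipping a flag from false to true, and only the sender side clears it again.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-channel state, not QEMU's MultiFDSendParams. */
typedef struct {
    atomic_bool preparing;   /* a producer has claimed this channel */
    atomic_bool pending;     /* a payload is ready for the sender thread */
} Channel;

static void channel_init(Channel *c)
{
    atomic_init(&c->preparing, false);
    atomic_init(&c->pending, false);
}

/* Producer: claim the first free channel. Returns its index, or -1. */
static int claim_channel(Channel *ch, size_t n)
{
    assert(ch != NULL);
    for (size_t i = 0; i < n; i++) {
        bool expected = false;
        /* Succeeds only if nobody else holds the claim. */
        if (atomic_compare_exchange_strong(&ch[i].preparing,
                                           &expected, true)) {
            return (int)i;
        }
    }
    return -1;
}

/* Producer: hand the prepared payload over to the sender thread. */
static void publish_job(Channel *c)
{
    atomic_store_explicit(&c->pending, true, memory_order_release);
}

/* Sender thread: job done, release the channel for the next producer. */
static void finish_job(Channel *c)
{
    atomic_store_explicit(&c->pending, false, memory_order_relaxed);
    atomic_store_explicit(&c->preparing, false, memory_order_release);
}
```

Because the claim is a single atomic read-modify-write, two producers can never both win the same channel, unlike a plain read of the flag followed by a later write.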