From: | Juan Quintela |
Subject: | Re: [PATCH 2/2] failover: don't allow to migrate a paused VM that needs PCI unplug |
Date: | Tue, 2 Nov 2021 19:09:13 +0100 |
On 02/11/2021 18:26, Juan Quintela wrote:
> "Michael S. Tsirkin" <mst@redhat.com> wrote:
>> On Tue, Nov 02, 2021 at 06:06:51PM +0100, Laurent Vivier wrote:
>>> On 02/11/2021 16:04, Michael S. Tsirkin wrote:
>>>> On Wed, Sep 29, 2021 at 04:43:11PM +0200, Laurent Vivier wrote:
>>>>> As the guest OS is paused, we will never receive the unplug event
>>>>> from the kernel and the migration cannot continue.
>>>>>
>>>>> Signed-off-by: Laurent Vivier <lvivier@redhat.com>
>>>>
>>>> Well ... what if user previously did
>>>>
>>>> pause
>>>> start migration
>>>> unpause
>>>>
>>>> we are breaking it now for no good reason.
>>>>
>>>> Further, how about
>>>>
>>>> start migration
>>>> pause
>>>>
>>>> are we going to break this too? by failing pause?
>>>>
>>>>
>>>
>>> TL;DR: This patch only prevents migrating a VFIO device, as failover allows
>>> starting a migration with a VFIO device plugged in.
>>>
>>> Long Story:
>>>
>>> * before this patch:
>>>
>>> - pause and start migration and unpause-> fails if we unpause too late
>>> because we migrate a VFIO device, works otherwise
>>
>>
>> confused about this one. can you explain pls?
>
> Pause the guest.
> Start migration.
>
> if (migration_in_setup(s) && !should_be_hidden) {
> if (failover_unplug_primary(n, dev)) {
> vmstate_unregister(VMSTATE_IF(dev), qdev_get_vmsd(dev), dev);
> qapi_event_send_unplug_primary(dev->id);
>
> We send the unplug request, but the guest is paused.
>
> qatomic_set(&n->failover_primary_hidden, true);
>
> callbacks, callbacks, callbacks.
>
> while (s->state == MIGRATION_STATUS_WAIT_UNPLUG &&
> qemu_savevm_state_guest_unplug_pending()) {
> qemu_sem_timedwait(&s->wait_unplug_sem, 250);
> }
>
> And we are not able to get out of that loop, because we never get to the
> point where the guest sends the unplug command.
>
> So, the only other thing that I can think of is putting one timeout
> there, but how much? That is a good question.
>
Please, no timeout. IMHO a timeout is worse than a clean exit on failure.
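
To make the "clean exit" idea concrete, here is a minimal standalone sketch, not the actual patch. It assumes the decision can be taken before entering the wait loop, from two facts: whether the guest is running and whether a failover primary still has to be unplugged. SketchState, SketchResult and sketch_migration_precheck() are hypothetical names made up for this illustration and do not exist in QEMU.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical, simplified stand-in for the real migration state. */
typedef struct {
    bool vm_running;      /* false when the guest is paused */
    bool unplug_pending;  /* a failover primary still has to be unplugged */
} SketchState;

typedef enum {
    SKETCH_MIG_OK,          /* safe to go on and wait for the unplug */
    SKETCH_MIG_WOULD_BLOCK  /* the wait for the unplug could never finish */
} SketchResult;

/*
 * Refuse up front instead of entering the wait loop: a paused guest cannot
 * answer the unplug request, so waiting on it would never end.
 * On failure, *errp points to a static error message; on success it is NULL.
 */
static SketchResult sketch_migration_precheck(const SketchState *s,
                                              const char **errp)
{
    if (s->unplug_pending && !s->vm_running) {
        *errp = "cannot migrate a paused VM that needs PCI unplug";
        return SKETCH_MIG_WOULD_BLOCK;
    }
    *errp = NULL;
    return SKETCH_MIG_OK;
}
```

With vm_running = false and unplug_pending = true the check returns SKETCH_MIG_WOULD_BLOCK and sets the error message, so the caller can fail the migration immediately. With a running guest it returns SKETCH_MIG_OK and the normal wait for the unplug can proceed. Unlike a timeout, there is no arbitrary delay to tune.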
Thanks,
Laurent