qemu-devel

From: Juan Quintela
Subject: Re: [PATCH 2/2] failover: don't allow to migrate a paused VM that needs PCI unplug
Date: Tue, 2 Nov 2021 19:09:13 +0100



On Tue, Nov 2, 2021, 18:47 Laurent Vivier <lvivier@redhat.com> wrote:
On 02/11/2021 18:26, Juan Quintela wrote:
> "Michael S. Tsirkin" <mst@redhat.com> wrote:
>> On Tue, Nov 02, 2021 at 06:06:51PM +0100, Laurent Vivier wrote:
>>> On 02/11/2021 16:04, Michael S. Tsirkin wrote:
>>>> On Wed, Sep 29, 2021 at 04:43:11PM +0200, Laurent Vivier wrote:
>>>>> As the guest OS is paused, we will never receive the unplug event
>>>>> from the kernel and the migration cannot continue.
>>>>>
>>>>> Signed-off-by: Laurent Vivier <lvivier@redhat.com>
>>>>
>>>> Well ... what if user previously did
>>>>
>>>> pause
>>>> start migration
>>>> unpause
>>>>
>>>> we are breaking it now for no good reason.
>>>>
>>>> Further, how about
>>>>
>>>> start migration
>>>> pause
>>>>
>>>> are we going to break this too? by failing pause?
>>>>
>>>>
>>>
>>> TL;DR: This patch only prevents migrating a VFIO device, since failover
>>> allows a migration to start with a VFIO device plugged in.
>>>
>>> Long Story:
>>>
>>> * before this patch:
>>>
>>> - pause, start migration, then unpause -> fails if we unpause too late,
>>> because we migrate a VFIO device; works otherwise
>>
>>
>> confused about this one. can you explain pls?
>
> Pause the guest.
> Start migration.
>
>       if (migration_in_setup(s) && !should_be_hidden) {
>          if (failover_unplug_primary(n, dev)) {
>               vmstate_unregister(VMSTATE_IF(dev), qdev_get_vmsd(dev), dev);
>               qapi_event_send_unplug_primary(dev->id);
>
> We send the unplug request, but the guest is paused.
>
>               qatomic_set(&n->failover_primary_hidden, true);
>
> callbacks, callbacks, callbacks.
>
>          while (s->state == MIGRATION_STATUS_WAIT_UNPLUG &&
>                 qemu_savevm_state_guest_unplug_pending()) {
>              qemu_sem_timedwait(&s->wait_unplug_sem, 250);
>          }
>
> And we are not able to get out of that loop, because we never get to the
> point where the guest sends the unplug command.
>
> So, the only other thing that I can think of is putting a timeout
> there, but how long?  That is a good question.
>

Please, no timeout. IMHO a timeout is worse than a clean exit on failure.


If not a timeout, how long should we wait for the guest?


Thanks,
Laurent
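
To make the two options in this exchange concrete, here is a minimal simulation in C. It is only a sketch under stated assumptions: every name in it (Guest, MigResult, unplug_acked, migrate_with_precheck, migrate_with_timeout) is hypothetical and none of it is QEMU code. Only the 250 ms poll interval is taken from the quoted loop, and time is simulated rather than slept.

```c
#include <assert.h>
#include <stdbool.h>

/* All names below are hypothetical stand-ins, not QEMU types or functions. */
typedef enum {
    MIG_OK,             /* guest acknowledged the unplug, migration may proceed */
    MIG_REFUSED_PAUSED, /* rejected up front: a paused guest cannot acknowledge */
    MIG_UNPLUG_TIMEOUT  /* gave up after the wait budget ran out */
} MigResult;

typedef struct {
    bool paused;        /* vCPUs stopped, so the guest cannot react */
    bool needs_unplug;  /* a failover primary device must be unplugged first */
    int ack_after_ms;   /* simulated time a running guest needs to acknowledge */
} Guest;

enum { POLL_MS = 250 }; /* same poll interval as the quoted wait loop */

/* A paused guest never acknowledges; a running one does after ack_after_ms. */
static bool unplug_acked(const Guest *g, int waited_ms)
{
    return !g->paused && waited_ms >= g->ack_after_ms;
}

/* Option 1: refuse before starting, then wait without a deadline. */
static MigResult migrate_with_precheck(const Guest *g)
{
    if (!g->needs_unplug) {
        return MIG_OK;
    }
    if (g->paused) {
        return MIG_REFUSED_PAUSED; /* clean exit instead of an endless wait */
    }
    /* This loop terminates only because the guest is known to be running. */
    for (int waited = 0; !unplug_acked(g, waited); waited += POLL_MS) {
    }
    return MIG_OK;
}

/* Option 2: always start, but bound the wait by timeout_ms. */
static MigResult migrate_with_timeout(const Guest *g, int timeout_ms)
{
    assert(timeout_ms >= 0);
    if (!g->needs_unplug) {
        return MIG_OK;
    }
    for (int waited = 0; waited <= timeout_ms; waited += POLL_MS) {
        if (unplug_acked(g, waited)) {
            return MIG_OK;
        }
    }
    return MIG_UNPLUG_TIMEOUT;
}
```

In this model, a 1000 ms timeout also fails a running guest that needs 3000 ms to acknowledge, which is the "how long?" problem raised above. The pre-check avoids picking a number, but the sketch does not model a guest that is paused after the migration has already started.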

