Re: [PATCH 2/2] failover: don't allow to migrate a paused VM that needs

From: Laurent Vivier
Subject: Re: [PATCH 2/2] failover: don't allow to migrate a paused VM that needs PCI unplug
Date: Tue, 2 Nov 2021 18:47:30 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.2.0

On 02/11/2021 18:26, Juan Quintela wrote:
"Michael S. Tsirkin" <mst@redhat.com> wrote:
On Tue, Nov 02, 2021 at 06:06:51PM +0100, Laurent Vivier wrote:
On 02/11/2021 16:04, Michael S. Tsirkin wrote:
On Wed, Sep 29, 2021 at 04:43:11PM +0200, Laurent Vivier wrote:
As the guest OS is paused, we will never receive the unplug event
from the kernel and the migration cannot continue.

Signed-off-by: Laurent Vivier <lvivier@redhat.com>

Well ... what if user previously did

start migration

we are breaking it now for no good reason.

Further, how about

start migration

are we going to break this too? by failing pause?

TL;DR: This patch only prevents migrating a VFIO device, since failover
allows a migration to start with a VFIO device plugged in.

Long Story:

* before this patch:

- pause, start migration, then unpause -> fails if we unpause too late,
because we migrate a VFIO device; works otherwise

confused about this one. can you explain pls?

Pause the guest.
Start migration.

      if (migration_in_setup(s) && !should_be_hidden) {
          if (failover_unplug_primary(n, dev)) {
              vmstate_unregister(VMSTATE_IF(dev), qdev_get_vmsd(dev), dev);

We send the unplug request, but the guest is paused.

              qatomic_set(&n->failover_primary_hidden, true);

callbacks, callbacks, callbacks.

         while (s->state == MIGRATION_STATUS_WAIT_UNPLUG &&
                qemu_savevm_state_guest_unplug_pending()) {
             qemu_sem_timedwait(&s->wait_unplug_sem, 250);

And we are not able to get out of that loop, because we never get to the
point where the guest sends the unplug command.

So, the only other thing that I can think of is putting a timeout
there, but how long?  That is a good question.

Please, no timeout. IMHO a timeout is worse than a clean exit on failure.
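
As a rough sketch of what "clean exit on failure" means here: check the
condition before the migration starts and return an error, instead of
entering a wait that cannot finish. This is an illustration only, not the
patch itself; struct vm_state, its fields, migration_precheck() and the
error text are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical summary of the two facts that matter for the check. */
struct vm_state {
    bool paused;            /* guest is paused, so it cannot answer an unplug */
    bool needs_pci_unplug;  /* a failover primary device is still plugged in */
};

/*
 * Returns 0 if migration may start. Returns -1 and writes a message into
 * err when the VM is paused and still needs a PCI unplug, because the
 * unplug event would never arrive.
 */
static int migration_precheck(const struct vm_state *vm,
                              char *err, size_t errlen)
{
    assert(vm != NULL && err != NULL && errlen > 0);

    if (vm->paused && vm->needs_pci_unplug) {
        snprintf(err, errlen,
                 "cannot migrate a paused VM that needs PCI unplug");
        return -1;
    }
    err[0] = '\0';
    return 0;
}
```

The caller gets an immediate, explicit failure and can unpause the guest and
retry; a paused VM without a failover primary, or a running VM with one, is
not affected by the check.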

