[Qemu-devel] [PATCH v4 0/7] migration: pause-before-switchover

From: Dr. David Alan Gilbert (git)
Date: Fri, 20 Oct 2017 10:05:49 +0100

From: "Dr. David Alan Gilbert" <address@hidden>

  This set attempts to make a race condition between migration and
drive-mirror (and other block users) soluble by allowing the migration
to be paused after the source qemu releases the block devices but
before the serialisation of the device state.

The symptom of this failure, as reported by Wangjie, is a:
   _co_do_pwritev: Assertion `!(bs->open_flags & 0x0800)' failed

and the source qemu dying; so the problem is pretty nasty.
This has only been seen on 2.9 onwards, but the theory is that
prior to 2.9 it might have been happening anyway and we were
perhaps getting unreported corruptions (lost writes); so this
really needs fixing.

This flow came from discussions between Kevin and me, and we can't
see a way of fixing it without exposing a new state to the management
layer.

The flow is now:

(qemu) migrate_set_capability pause-before-switchover on
(qemu) migrate -d ...
(qemu) info migrate
Migration status: pre-switchover
<< issue commands to clean up any block jobs>>

(qemu) migrate_continue pre-switchover
(qemu) info migrate
Migration status: completed
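
For management software that talks QMP rather than HMP, the same flow
can be sketched roughly as below. This is only an illustration, not code
from the series: the 'send' callable, the 'cleanup_block_jobs' hook and
the helper names are hypothetical, and the argument layout of the QMP
commands is my assumption of how the HMP commands above map onto QMP.

```python
import time
from typing import Callable, Dict

Command = Dict[str, object]
Sender = Callable[[Command], Dict[str, dict]]


def qmp(execute: str, **arguments: object) -> Command:
    """Build a QMP command dict; 'arguments' is left out when empty."""
    cmd: Command = {"execute": execute}
    if arguments:
        cmd["arguments"] = arguments
    return cmd


def wait_for_status(send: Sender, wanted: str,
                    poll_interval: float = 0.5, max_polls: int = 1000) -> None:
    """Poll query-migrate until the migration reports the wanted status."""
    for _ in range(max_polls):
        status = send(qmp("query-migrate"))["return"].get("status")
        if status == wanted:
            return
        if status in ("failed", "cancelled"):
            raise RuntimeError(f"migration ended with status {status!r}")
        time.sleep(poll_interval)
    raise TimeoutError(f"migration never reached {wanted!r}")


def migrate_with_pause(send: Sender, uri: str,
                       cleanup_block_jobs: Callable[[], None],
                       poll_interval: float = 0.5) -> None:
    """Drive a migration that pauses before switchover (assumed QMP mapping)."""
    # Equivalent of: migrate_set_capability pause-before-switchover on
    send(qmp("migrate-set-capabilities",
             capabilities=[{"capability": "pause-before-switchover",
                            "state": True}]))
    # Equivalent of: migrate -d <uri>
    send(qmp("migrate", uri=uri))
    # Wait for the pause point, then let the caller tidy up block jobs.
    wait_for_status(send, "pre-switchover", poll_interval)
    cleanup_block_jobs()
    # Equivalent of: migrate_continue pre-switchover
    send(qmp("migrate-continue", state="pre-switchover"))
    wait_for_status(send, "completed", poll_interval)
```

Here 'send' stands in for whatever QMP transport the caller already has;
it is expected to return the decoded reply with a "return" member. A real
client would more likely wait for migration events than poll, but polling
keeps the sketch short.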

This has been tested with Jiri's libvirt at:
  https://gitlab.com/jirkade/libvirt.git migration-pause
  migrate --live --copy-storage-all --verbose

The precopy flow is:

The postcopy flow is:

Although the behaviour with postcopy only gets interesting when
we add something like Max's active-sync.


v4
  Comment fix in 'migrate-continue' example (thanks Jiri)

v3
  A couple of FIXUPs that had escaped v2's merge

v2
  Pause *before* block inactivation (thanks Peter)
  Rename state and capability to Dan+KWolf's combined suggestion

Dr. David Alan Gilbert (7):
  migration: Add 'pause-before-switchover' capability
  migration: Add 'pre-switchover' and 'device' statuses
  migration: Wait for semaphore before completing migration
  migration: migrate-continue
  migrate: HMP migrate_continue
  migration: allow cancel to unpause
  migration: pause-before-switchover for postcopy

 hmp-commands.hx       | 12 +++++++
 hmp.c                 | 13 ++++++++
 hmp.h                 |  1 +
 migration/migration.c | 88 +++++++++++++++++++++++++++++++++++++++++++++++++--
 migration/migration.h |  4 +++
 qapi/migration.json   | 30 ++++++++++++++++--
 6 files changed, 144 insertions(+), 4 deletions(-)


