qemu-devel

Re: [RFC PATCH 0/9] migration/snap-tool: External snapshot utility


From: Andrey Gruzdev
Subject: Re: [RFC PATCH 0/9] migration/snap-tool: External snapshot utility
Date: Mon, 29 Mar 2021 11:11:03 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.3.2

Ping

On 17.03.2021 19:32, Andrey Gruzdev wrote:

This series is a kind of PoC for asynchronous snapshot reverting. This is
about external snapshots only and doesn't involve block devices. Thus, it's
mainly intended to be used with the new 'background-snapshot' migration
capability and otherwise standard QEMU migration mechanism.

The major ideas behind this first version were:
  * Make it compatible with 'exec:'-style migration - the options are to create
    a separate tool or to integrate into qemu-system.
  * Support asynchronous revert stage by using unaltered postcopy logic
    at destination. To do this, we should be capable of saving RAM pages
    so that any particular page can be directly addressed by its block ID
    and page offset. Possible solutions here seem to be:
      - use a separate index (and store it somewhere)
      - create a sparse file on the host FS and address pages by file offset
      - use a QCOW2 (or other) image container with inherent sparsity support
  * Make the snapshot image file dense on the host FS so we don't depend on
    copy/backup tools and how they deal with sparse files. Of course,
    there's some performance cost for this choice.
  * Keep the code that parses the unstructured format of the migration stream
    reasonably simple. Also, try to have minimal dependencies on QEMU
    migration code, both RAM and device.
  * Try to keep page save latencies small while not degrading migration
    bandwidth too much.
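
To illustrate the "directly addressed by block ID and page offset" idea for a
dense image, here is a minimal sketch. It assumes each RAM block gets one
contiguous region in the image; the names `build_layout` and
`page_file_offset` and the 4 KiB page size are illustrative assumptions, not
the actual qemu-snap code.

```python
PAGE_SIZE = 4096  # assumed target page size, for illustration only


def build_layout(blocks, align=PAGE_SIZE):
    """Assign each RAM block a contiguous base offset in the image.

    blocks: list of (block_id, length_in_bytes) in a fixed order.
    Returns {block_id: (base_offset, length)}.
    """
    layout = {}
    base = 0
    for block_id, length in blocks:
        layout[block_id] = (base, length)
        # Round up so that the next block starts on an aligned boundary.
        base += (length + align - 1) // align * align
    return layout


def page_file_offset(layout, block_id, page_offset):
    """Map (block ID, page offset) to a byte offset inside the image."""
    base, length = layout[block_id]
    if page_offset % PAGE_SIZE or not 0 <= page_offset < length:
        raise ValueError("page offset is unaligned or outside the block")
    return base + page_offset
```

With such a layout no separate index is needed: a postcopy page request can be
turned into an image offset with one lookup and one addition.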

For this first version I decided not to integrate into the main QEMU code but
to create a separate tool. The main reason is that there's not too much migration
code that is target-specific and can be used in its unmodified form. Also,
it's still not very clear how to do the 'qemu-system' integration in terms of
command-line (or monitor/QMP?) interface extension.

For the storage format, QCOW2 as a container with a large (1MB) cluster size
seems to be an optimal choice. A larger cluster is beneficial for performance,
particularly when image preallocation is disabled. Such a cluster size does not
result in too much internal fragmentation (~10% of space wasted in most cases),
yet it significantly reduces the number of expensive cluster allocations.
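
As a rough illustration of the allocation-count argument, the sketch below
compares the number of clusters needed to hold a given amount of guest RAM.
The 64 KiB figure is the QCOW2 default cluster size; the 4 GiB guest and the
4 KiB page size are assumptions chosen only for the arithmetic.

```python
KiB = 1024
MiB = 1024 * KiB
GiB = 1024 * MiB


def clusters_needed(ram_bytes, cluster_size):
    """Number of clusters required to store ram_bytes (ceiling division)."""
    return -(-ram_bytes // cluster_size)


def pages_per_cluster(cluster_size, page_size=4 * KiB):
    """How many guest pages fit into one cluster."""
    return cluster_size // page_size


ram = 4 * GiB  # hypothetical guest RAM size
small = clusters_needed(ram, 64 * KiB)  # default QCOW2 cluster size
large = clusters_needed(ram, 1 * MiB)   # cluster size proposed here
ratio = small // large                  # reduction in cluster allocations
```

So a fully populated image needs 16 times fewer cluster allocations with 1MB
clusters, and each allocated cluster covers 256 pages instead of 16.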

A somewhat tricky part is dispatching the QEMU migration stream, because it is
mostly unstructured and depends on configuration parameters like
'send-configuration' and 'send-section-footer'. But with default values in the
migration globals, the implemented dispatching code seems to work well and
should not have compatibility issues for a reasonably long time.
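
To give an idea of what the dispatcher has to recognize, here is a minimal
sketch of the stream header check and the section type bytes. The constant
values are the ones I know from migration/savevm.c and should be re-checked
against the tree; the function names are hypothetical and this is not the
qemu-snap parsing code.

```python
import struct

QEMU_VM_FILE_MAGIC = 0x5145564D  # ASCII "QEVM"
QEMU_VM_FILE_VERSION = 3

# Section type bytes that follow the header in the stream.
SECTION_TYPES = {
    0x00: "EOF",
    0x01: "SECTION_START",
    0x02: "SECTION_PART",
    0x03: "SECTION_END",
    0x04: "SECTION_FULL",
    0x05: "SUBSECTION",
    0x06: "VMDESCRIPTION",
    0x07: "CONFIGURATION",   # present when 'send-configuration' is on
    0x08: "COMMAND",
    0x7E: "SECTION_FOOTER",  # present when 'send-section-footer' is on
}


def parse_stream_header(buf):
    """Validate the big-endian magic/version pair; return bytes consumed."""
    magic, version = struct.unpack_from(">II", buf, 0)
    if magic != QEMU_VM_FILE_MAGIC:
        raise ValueError("not a QEMU migration stream")
    if version != QEMU_VM_FILE_VERSION:
        raise ValueError("unsupported stream version %d" % version)
    return 8


def section_type_name(type_byte):
    """Return a readable name for a section type byte."""
    return SECTION_TYPES.get(type_byte, "UNKNOWN")
```

The bodies of the sections themselves carry no length prefix, which is why the
dispatcher has to understand each section type well enough to find where it
ends.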

I decided to keep the RAM save path synchronous; in any case it's better to use
writeback cache mode for live snapshots because of their interleaved page
address pattern. A page coalescing buffer is used to merge contiguous pages and
so optimize block layer writes.
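
The coalescing idea can be sketched as follows. This is an illustration under
assumptions, not the tool's implementation: the `PageCoalescer` class, its
`write_fn` callback and the 1 MiB buffer limit are all hypothetical.

```python
PAGE_SIZE = 4096  # assumed page size, for illustration only


class PageCoalescer:
    """Merge pages with adjacent image offsets into a single write."""

    def __init__(self, write_fn, max_bytes=1024 * 1024):
        self.write_fn = write_fn    # called as write_fn(offset, data)
        self.max_bytes = max_bytes  # flush once the buffer would exceed this
        self.start = None           # image offset of the first buffered page
        self.buf = bytearray()

    def add_page(self, offset, data):
        contiguous = (self.start is not None
                      and offset == self.start + len(self.buf))
        # A gap in offsets or a full buffer ends the current run.
        if not contiguous or len(self.buf) + len(data) > self.max_bytes:
            self.flush()
        if self.start is None:
            self.start = offset
        self.buf += data

    def flush(self):
        """Issue one write for the buffered run, if there is one."""
        if self.start is not None:
            self.write_fn(self.start, bytes(self.buf))
        self.start = None
        self.buf = bytearray()
```

Three adjacent pages followed by a distant one then produce two writes instead
of four, provided `flush()` is called once at the end of the stream.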

Since opening the image file in cached mode would not do any good for snapshot
loading, the common scenario is to use Linux native AIO with O_DIRECT.
AIO support in the RAM loading path is implemented with a ring of preallocated
fixed-size buffers, so that a number of block requests is always outstanding.
The ring also ensures in-order request completion.
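
A simplified model of such a ring is sketched below. It is only an analogy: a
thread pool stands in for Linux native AIO, and `read_in_order` and the
`read_into` callback are hypothetical names. What it does show is the fixed
set of preallocated buffers, the bounded number of requests in flight, and
consumption strictly in submission order.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor


def read_in_order(read_into, offsets, buf_size, depth=4):
    """Yield (offset, data) in submission order with up to `depth` reads
    in flight.

    read_into(offset, buf) fills the bytearray `buf` and returns the number
    of bytes read.
    """
    bufs = [bytearray(buf_size) for _ in range(depth)]  # allocated once
    ring = deque()  # (offset, slot, future), oldest request first
    pending = iter(offsets)
    with ThreadPoolExecutor(max_workers=depth) as pool:

        def submit(slot):
            off = next(pending, None)
            if off is not None:
                fut = pool.submit(read_into, off, bufs[slot])
                ring.append((off, slot, fut))

        for slot in range(depth):
            submit(slot)
        while ring:
            off, slot, fut = ring.popleft()
            nbytes = fut.result()  # wait for the oldest request only
            yield off, bytes(bufs[slot][:nbytes])
            submit(slot)           # reuse the slot for the next request
```

Later requests may well finish earlier, but the consumer only ever waits on
the head of the ring, so data is handed over in the order it was requested.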

How to use:

**Save:**
* qemu> migrate_set_capability background-snapshot on
* qemu> migrate "exec:<qemu-bin-path>/qemu-snap -s <virtual-size>
           --cache=writeback --aio=threads save <image-file.qcow2>"

**Load:**
* Use 'qemu-system-* -incoming defer'
* qemu> migrate_incoming "exec:<qemu-bin-path>/qemu-snap
          --cache=none --aio=native load <image-file.qcow2>"

**Load with postcopy:**
* Use 'qemu-system-* -incoming defer'
* qemu> migrate_set_capability postcopy-ram on
* qemu> migrate_incoming "exec:<qemu-bin-path>/qemu-snap --postcopy=60
          --cache=none --aio=native load <image-file.qcow2>"
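
For scripting the commands above, a small helper can assemble the 'exec:'
string. This is just a convenience sketch: `snap_exec_uri` is a hypothetical
name, and it only emits the options already shown in the examples above.

```python
import shlex


def snap_exec_uri(qemu_bin_path, mode, image, virtual_size=None,
                  postcopy=None, cache=None, aio=None):
    """Build the 'exec:' URI for migrate / migrate_incoming."""
    if mode not in ("save", "load"):
        raise ValueError("mode must be 'save' or 'load'")
    args = [qemu_bin_path + "/qemu-snap"]
    if virtual_size is not None:
        args += ["-s", str(virtual_size)]
    if postcopy is not None:
        args.append("--postcopy=%s" % postcopy)
    if cache is not None:
        args.append("--cache=%s" % cache)
    if aio is not None:
        args.append("--aio=%s" % aio)
    args += [mode, image]
    # Quote each argument so that paths with spaces survive the shell.
    return "exec:" + " ".join(shlex.quote(a) for a in args)
```

The result is what gets passed, in double quotes, to 'migrate' or
'migrate_incoming' at the monitor prompt.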

And yes, asynchronous revert works well only with an SSD, not with a rotational disk.

Some performance stats:
* SATA SSD drive with ~500/450 MB/s sequential read/write and ~60K IOPS max.
* 220 MB/s average save rate (depends on workload)
* 440 MB/s average load rate in precopy
* 260 MB/s average load rate in postcopy

Andrey Gruzdev (9):
  migration/snap-tool: Introduce qemu-snap tool
  migration/snap-tool: Snapshot image create/open routines for qemu-snap
    tool
  migration/snap-tool: Preparations to run code in main loop context
  migration/snap-tool: Introduce qemu_ftell2() routine to qemu-file.c
  migration/snap-tool: Block layer AIO support and file utility routines
  migration/snap-tool: Move RAM_SAVE_FLAG_xxx defines to migration/ram.h
  migration/snap-tool: Complete implementation of snapshot saving
  migration/snap-tool: Implementation of snapshot loading in precopy
  migration/snap-tool: Implementation of snapshot loading in postcopy

 include/qemu-snap.h   |  163 ++++
 meson.build           |    2 +
 migration/qemu-file.c |    6 +
 migration/qemu-file.h |    1 +
 migration/ram.c       |   16 -
 migration/ram.h       |   16 +
 qemu-snap-handlers.c  | 1801 +++++++++++++++++++++++++++++++++++++++++
 qemu-snap-io.c        |  325 ++++++++
 qemu-snap.c           |  673 +++++++++++++++
 9 files changed, 2987 insertions(+), 16 deletions(-)
 create mode 100644 include/qemu-snap.h
 create mode 100644 qemu-snap-handlers.c
 create mode 100644 qemu-snap-io.c
 create mode 100644 qemu-snap.c



-- 
Andrey Gruzdev, Principal Engineer
Virtuozzo GmbH  +7-903-247-6397
                virtuzzo.com
