From: Stefan Hajnoczi
Subject: Re: [Qemu-devel] [PATCH v3] XBZRLE delta for live migration of large memory apps
Date: Tue, 2 Aug 2011 19:08:01 +0100
User-agent: Mutt/1.5.21 (2010-09-15)
On Tue, Aug 02, 2011 at 04:01:06PM +0200, Alexander Graf wrote:
>
> On 02.08.2011, at 15:45, Shribman, Aidan wrote:
>
> > Subject: [PATCH v3] XBZRLE delta for live migration of large memory apps
> > From: Aidan Shribman <address@hidden>
> >
> > By using XBZRLE (Xor Binary Zero Run-Length-Encoding) we can reduce VM
> > downtime and total live-migration time for VMs running memory-write-intensive
> > workloads typical of large enterprise applications such as SAP ERP systems,
> > and, generally speaking, of any application with a sparse memory update
> > pattern.
> >
> > On the sender side XBZRLE is used as a compact delta encoding of page
> > updates, retrieving the old page content from an LRU cache (default size
> > of 64 MB). The receiving side uses the existing page content and XBZRLE
> > to decode the new page content.
> >
> > Work was originally based on research results published at VEE 2011:
> > "Evaluation of Delta Compression Techniques for Efficient Live Migration
> > of Large Virtual Machines" by Benoit, Svard, Tordsson and Elmroth.
> > Additionally, the delta encoder was improved further by using XBZRLE
> > instead of the original XBRLE.
> >
> > XBZRLE has a sustained bandwidth of 1.5-2.2 GB/s for typical workloads
> > making it
> > ideal for in-line, real-time encoding such as is needed for live-migration.
> >
> > A typical usage scenario:
> > {qemu} migrate_set_cachesize 256m
> > {qemu} migrate -x -d tcp:destination.host:4444
> > {qemu} info migrate
> > ...
> > transferred ram-duplicate: A kbytes
> > transferred ram-duplicate: B pages
> > transferred ram-normal: C kbytes
> > transferred ram-normal: D pages
> > transferred ram-xbrle: E kbytes
> > transferred ram-xbrle: F pages
> > overflow ram-xbrle: G pages
> > cache-hit ram-xbrle: H pages
> > cache-lookup ram-xbrle: J pages
> >
> > Testing: live migration with XBZRLE completed in 110 seconds; without
> > XBZRLE, live migration was not able to complete.
> >
> > A simple synthetic memory r/w load generator:
> >
> > #include <stdlib.h>
> > #include <stdio.h>
> >
> > int main(void)
> > {
> >     /* 16 MB zeroed buffer; increment every 1024th byte so each 4 KB
> >      * page receives a few sparse writes per pass */
> >     char *buf = (char *) calloc(4096, 4096);
> >     while (1) {
> >         int i;
> >         for (i = 0; i < 4096 * 4; i++) {
> >             buf[i * 4096 / 4]++;
> >         }
> >         printf(".");
> >         fflush(stdout);     /* make progress dots visible immediately */
> >     }
> > }
> >
> > Signed-off-by: Benoit Hudzia <address@hidden>
> > Signed-off-by: Petter Svard <address@hidden>
> > Signed-off-by: Aidan Shribman <address@hidden>
>
>
> So if I understand correctly, this enables delta updates for dirty pages?
> Would it be possible to do the same on the block layer, so that VM backing
> file data could potentially save the new information as delta over the old
> block? Especially with metadata updates, that could save quite some disk
> space.
>
> Of course that would mean that a block is no longer the size of a block :).
> Maybe something to consider for qcow3?
This is a good idea for a transport format but I think it would
noticeably degrade the I/O performance of a running VM. Some file
systems also provide compression but it is rarely used; the use-case is
basically "Write Once Read Many" archiving. In other scenarios I don't
think this will work well.
I/O request size is restricted to multiples of the host device blocksize
(e.g. 512 bytes or 4 KB). Because of this it isn't trivial to pack
sub-blocksized data.
Since disk I/O is slow the image format either needs to be simple or use
a significantly superior data structure that makes up for the additional
metadata.
VMDK has a "stream optimized" metadata format and QCOW2 supports
compression but I don't think they do delta compression. Also there may
be limitations on how compact the image file stays when you rewrite
data.
Stefan