Re: [Qemu-devel] [PATCH v3] XBZRLE delta for live migration of large mem


From: Stefan Hajnoczi
Subject: Re: [Qemu-devel] [PATCH v3] XBZRLE delta for live migration of large memory apps
Date: Tue, 2 Aug 2011 19:05:54 +0100
User-agent: Mutt/1.5.21 (2010-09-15)

On Tue, Aug 02, 2011 at 03:45:56PM +0200, Shribman, Aidan wrote:
> Subject: [PATCH v3] XBZRLE delta for live migration of large memory apps
> From: Aidan Shribman <address@hidden>
> 
> By using XBZRLE (Xor Binary Zero Run-Length-Encoding) we can reduce VM downtime
> and total live-migration time for VMs running memory write intensive workloads
> typical of large enterprise applications such as SAP ERP Systems, and, generally
> speaking, of any application with a sparse memory update pattern.
> 
> On the sender side XBZRLE is used as a compact delta encoding of page updates,
> retrieving the old page content from an LRU cache (default size of 64 MB). The
> receiving side uses the existing page content and XBZRLE to decode the new page
> content.
> 
> Work was originally based on research results published at VEE 2011: Evaluation
> of Delta Compression Techniques for Efficient Live Migration of Large Virtual
> Machines by Benoit, Svard, Tordsson and Elmroth. Additionally the delta encoder
> XBRLE was improved further using XBZRLE instead.
> 
> XBZRLE has a sustained bandwidth of 1.5-2.2 GB/s for typical workloads making it
> ideal for in-line, real-time encoding such as is needed for live-migration.

What is the CPU cost of xbzrle live migration on the source host?  I'm
thinking about a graph showing CPU utilization (e.g. from mpstat(1))
that has two datasets: migration without xbzrle and migration with
xbzrle.

> @@ -128,28 +288,35 @@ static int ram_save_block(QEMUFile *f)
>                                              current_addr + TARGET_PAGE_SIZE,
>                                              MIGRATION_DIRTY_FLAG);
> 
> -            p = block->host + offset;
> +            if (arch_mig_state.use_xbrle) {
> +                p = qemu_mallocz(TARGET_PAGE_SIZE);

qemu_malloc()

> +static uint8_t count_hash_bits(uint64_t v)
> +{
> +    uint8_t bits = 0;
> +
> +    while (!(v & 1)) {
> +        v = v >> 1;
> +        bits++;
> +    }
> +    return bits;
> +}

See ffs(3).  ffsll() does what you need.
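
A minimal sketch of what that replacement might look like, assuming glibc
(ffsll() is a GNU extension) and assuming callers never pass zero. Note that
ffsll() returns a 1-based bit position, so it is off by one relative to the
loop in the patch, and that the original loop never terminates for v == 0.
The assert() is an assumption of this sketch, not part of the patch.

```c
#define _GNU_SOURCE   /* ffsll() is a glibc extension */
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <strings.h>

/* Count the trailing zero bits of v.
 * ffsll() returns the 1-based index of the least significant set bit,
 * or 0 when no bit is set, so subtract one to get a zero-based count. */
static uint8_t count_hash_bits(uint64_t v)
{
    assert(v != 0);   /* assumption: zero is a caller bug */
    return (uint8_t)(ffsll((long long)v) - 1);
}
```

For a power-of-two input such as 1 << 20 this returns 20, the same value the
original loop computes.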

> +static uint8_t xor_buf[TARGET_PAGE_SIZE];
> +static uint8_t xbzrle_buf[TARGET_PAGE_SIZE * 2];

Do these need to be static globals?  It should be fine to define them as
local variables inside the functions that need them; there is enough
stack space.

> +
> +int xbzrle_encode(uint8_t *xbzrle, const uint8_t *old, const uint8_t *curr,
> +    const size_t max_compressed_len)
> +{
> +    int compressed_len;
> +
> +    xor_encode_word(xor_buf, old, curr);
> +    compressed_len = rle_encode((uint64_t *)xor_buf,
> +        sizeof(xor_buf)/sizeof(uint64_t), xbzrle_buf,
> +        sizeof(xbzrle_buf));
> +    if (compressed_len > max_compressed_len) {
> +        return -1;
> +    }
> +    memcpy(xbzrle, xbzrle_buf, compressed_len);

Why the intermediate xbzrle_buf buffer and why the memcpy()?  rle_encode()
could write straight into the caller's buffer:

return rle_encode((uint64_t *)xor_buf, sizeof(xor_buf) / sizeof(uint64_t),
                  xbzrle, max_compressed_len);
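
To illustrate both points (a stack-local XOR buffer and encoding directly
into the caller's buffer), here is a self-contained sketch. The rle_encode()
below is a hypothetical stand-in with an invented output format, not the
encoder from the patch, and PAGE_SIZE stands in for TARGET_PAGE_SIZE. The
sketch assumes the encoder itself enforces the output limit and returns -1
on overflow, which is what makes the separate length check unnecessary.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE  4096   /* stand-in for TARGET_PAGE_SIZE */
#define PAGE_WORDS (PAGE_SIZE / sizeof(uint64_t))

/* Hypothetical encoder: each nonzero word is emitted as a 2-byte count of
 * the zero words preceding it, followed by the 8-byte word itself.
 * Trailing zero words are implicit.
 * Returns the number of bytes written, or -1 if dst_len would be exceeded. */
static int rle_encode(const uint64_t *src, size_t nwords,
                      uint8_t *dst, size_t dst_len)
{
    size_t out = 0;
    uint16_t zeros = 0;

    for (size_t i = 0; i < nwords; i++) {
        if (src[i] == 0) {
            zeros++;
            continue;
        }
        if (dst_len - out < sizeof(zeros) + sizeof(src[i])) {
            return -1;   /* output limit reached */
        }
        memcpy(dst + out, &zeros, sizeof(zeros));
        out += sizeof(zeros);
        memcpy(dst + out, &src[i], sizeof(src[i]));
        out += sizeof(src[i]);
        zeros = 0;
    }
    return (int)out;
}

/* XOR old and curr into a local buffer, then encode straight into the
 * caller's buffer: no static scratch space and no trailing memcpy(). */
static int xbzrle_encode(uint8_t *xbzrle, const uint8_t *old,
                         const uint8_t *curr, size_t max_compressed_len)
{
    uint64_t xor_buf[PAGE_WORDS];   /* one page on the stack */
    uint8_t *x = (uint8_t *)xor_buf;

    for (size_t i = 0; i < PAGE_SIZE; i++) {
        x[i] = old[i] ^ curr[i];
    }
    return rle_encode(xor_buf, PAGE_WORDS, xbzrle, max_compressed_len);
}
```

With this shape the function is also reentrant, since it no longer shares
static buffers between callers.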

Stefan


