qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCHv4 0/9] buffer_is_zero / migration optimizations


From: Paolo Bonzini
Subject: Re: [Qemu-devel] [PATCHv4 0/9] buffer_is_zero / migration optimizations
Date: Fri, 22 Mar 2013 18:25:39 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130311 Thunderbird/17.0.4

Il 22/03/2013 13:46, Peter Lieven ha scritto:
> this is v4 of my patch series with various optimizations in
> zero buffer checking and migration tweaks.
> 
> thanks especially to Eric Blake for reviewing.
> 
> v4:
> - do not inline buffer_find_nonzero_offset()
> - inline can_usebuffer_find_nonzero_offset() correctly
> - readd asserts in buffer_find_nonzero_offset() as profiling
>   shows they do not hurt.
> - change last occurences of scalar 8 by 
>   BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR
> - avoid deferencing p already in patch 5 where we
>   know that the page (p) is zero
> - explicitly set bytes_sent = 0 if we skip a zero page.
>   bytes_sent was 0 before, but it was not obvious.
> - add accounting information for skipped zero pages
> - fix errors reported by checkpatch.pl
> 
> v3:
> - remove asserts, inline functions and add a check
>   function if buffer_find_nonzero_offset() can be used.
> - use above check function in buffer_is_zero() and
>   find_next_bit().
> - use buffer_is_nonzero_offset() directly to find
>   zero pages. we know that all requirements are met
>   for memory pages.
> - fix C89 violation in buffer_is_zero().
> - avoid derefencing p in ram_save_block() if we already
>   know the page is zero.
> - fix initialization of last_offset in reset_ram_globals().
> - avoid skipping pages with offset == 0 in bulk stage in
>   migration_bitmap_find_and_reset_dirty().
> - compared to v1 check for zero pages also after bulk
>   ram migration as there are guests (e.g. Windows) which
>   zero out large amount of memory while running.
> 
> v2:
> - fix description, add trivial zero check and add asserts 
>   to buffer_find_nonzero_offset.
> - add a constant for the unroll factor of buffer_find_nonzero_offset
> - replace is_dup_page() by buffer_is_zero()
> - added test results to xbzrle patch
> - optimize descriptions
> 
> Peter Lieven (9):
>   move vector definitions to qemu-common.h
>   cutils: add a function to find non-zero content in a buffer
>   buffer_is_zero: use vector optimizations if possible
>   bitops: use vector algorithm to optimize find_next_bit()
>   migration: search for zero instead of dup pages
>   migration: add an indicator for bulk state of ram migration
>   migration: do not sent zero pages in bulk stage
>   migration: do not search dirty pages in bulk stage
>   migration: use XBZRLE only after bulk stage
> 
>  arch_init.c                   |   74 
> +++++++++++++++++++----------------------
>  hmp.c                         |    2 ++
>  include/migration/migration.h |    2 ++
>  include/qemu-common.h         |   37 +++++++++++++++++++++
>  migration.c                   |    3 +-
>  qapi-schema.json              |    6 ++--
>  qmp-commands.hx               |    3 +-
>  util/bitops.c                 |   24 +++++++++++--
>  util/cutils.c                 |   50 ++++++++++++++++++++++++++++
>  9 files changed, 155 insertions(+), 46 deletions(-)
> 

I think patch 4 is a bit overengineered.  I would prefer the simple
patch you had using three/four non-vectorized accesses.  The setup cost
of the vectorized buffer_is_zero is quite high, and 64 bits are just
256k RAM; if the host doesn't touch 256k RAM, it will incur the overhead.

I would prefer some more benchmarking for patch 5, but it looks ok.

The rest are fine, thanks!

Paolo



reply via email to

[Prev in Thread] Current Thread [Next in Thread]