[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCHv2 3/9] buffer_is_zero: use vector optimizations
From: |
Eric Blake |
Subject: |
Re: [Qemu-devel] [PATCHv2 3/9] buffer_is_zero: use vector optimizations if possible |
Date: |
Tue, 19 Mar 2013 10:08:03 -0600 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130311 Thunderbird/17.0.4 |
On 03/15/2013 09:50 AM, Peter Lieven wrote:
> performance gain on SSE2 is approx. 20-25%. altivec
> is not tested. performance for unsigned long arithmetic
> is unchanged.
>
> Signed-off-by: Peter Lieven <address@hidden>
> ---
> util/cutils.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/util/cutils.c b/util/cutils.c
> index 857dd7d..00d98fb 100644
> --- a/util/cutils.c
> +++ b/util/cutils.c
> @@ -190,6 +190,13 @@ size_t buffer_find_nonzero_offset(const void *buf,
> size_t len)
> */
> bool buffer_is_zero(const void *buf, size_t len)
> {
> + /* use vector optimized zero check if possible */
> + if (((uintptr_t) buf) % sizeof(VECTYPE) == 0
> + && len % (BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR
> + * sizeof(VECTYPE)) == 0) {
Is it worth factoring this check into something more reusable, by adding
something like 'bool buffer_can_use_vectors(buf, len)' in patch 2/9?
> + return buffer_find_nonzero_offset(buf, len)==len;
Spaces around binary operators.
Is it worth rewriting this function into a simpler:
check up to (BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR - 1) longs until
we are aligned
check buffer_find_nonzero_offset on the aligned middle
check up to (BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR - 1) longs at tail
instead of having two instances of code that can loop over the entire
buffer? Otherwise, searching for zeros on an unaligned buffer will
remain slower, even though the bulk of the search could still benefit
from the vector operations.
> + }
> +
> /*
> * Use long as the biggest available internal data type that fits into
> the
> * CPU register and unroll the loop to smooth out the effect of memory
Your patch results in C99 declarations after statements; while we
require C99, I'm not sure if qemu prefers to stick to the C89 style of
declarations before statements.
--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library http://libvirt.org
signature.asc
Description: OpenPGP digital signature
- [Qemu-devel] [PATCHv2 0/9] buffer_is_zero / migration optimizations, Peter Lieven, 2013/03/15
- [Qemu-devel] [PATCHv2 9/9] migration: use XBZRLE only after bulk stage, Peter Lieven, 2013/03/15
- [Qemu-devel] [PATCHv2 8/9] migration: do not search dirty pages in bulk stage, Peter Lieven, 2013/03/15
- [Qemu-devel] [PATCHv2 4/9] bitops: use vector algorithm to optimize find_next_bit(), Peter Lieven, 2013/03/15
- [Qemu-devel] [PATCHv2 3/9] buffer_is_zero: use vector optimizations if possible, Peter Lieven, 2013/03/15
- Re: [Qemu-devel] [PATCHv2 3/9] buffer_is_zero: use vector optimizations if possible,
Eric Blake <=
- [Qemu-devel] [PATCHv2 2/9] cutils: add a function to find non-zero content in a buffer, Peter Lieven, 2013/03/15
[Qemu-devel] [PATCHv2 7/9] migration: do not sent zero pages in bulk stage, Peter Lieven, 2013/03/15