From: Eric Blake
Subject: Re: [Qemu-devel] [PATCHv2 4/9] bitops: use vector algorithm to optimize find_next_bit()
Date: Tue, 19 Mar 2013 10:49:25 -0600
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130311 Thunderbird/17.0.4

On 03/15/2013 09:50 AM, Peter Lieven wrote:
> this patch adds the usage of buffer_find_nonzero_offset()
> to skip large areas of zeroes.
>
> compared to loop unrolling presented in an earlier
> patch this adds another 50% performance benefit for
> skipping large areas of zeroes. loop unrolling alone
> added close to 100% speedup.
>
> Signed-off-by: Peter Lieven <address@hidden>
> ---
> util/bitops.c | 26 +++++++++++++++++++++++---
> 1 file changed, 23 insertions(+), 3 deletions(-)
> +    while (size >= BITS_PER_LONG) {
> +        if ((tmp = *p)) {
> +            goto found_middle;
> +        }
> +        if (((uintptr_t) p) % sizeof(VECTYPE) == 0
> +                && size >= BITS_PER_BYTE * sizeof(VECTYPE)
> +                   * BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR) {
Another instance where a helper function to check for alignment would be
nice. Except this time you have a BITS_PER_BYTE factor, so you would be
calling something like buffer_can_use_vectors(buf, size / BITS_PER_BYTE).
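
A minimal sketch of what such a helper could look like. The name
buffer_can_use_vectors comes from the suggestion above; the VECTYPE
definition and the unroll factor of 8 are stand-in assumptions so that the
snippet compiles on its own, not the definitions from the patch series. The
length is taken in bytes, which is why the bitops caller would pass
size / BITS_PER_BYTE.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: stand-ins for the real definitions in the patch series. */
typedef struct { unsigned char bytes[16]; } VECTYPE;
#define BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR 8

/*
 * Hypothetical helper: true if buf is aligned to sizeof(VECTYPE) and
 * len (in bytes) covers at least one fully unrolled vector iteration.
 */
static inline bool buffer_can_use_vectors(const void *buf, size_t len)
{
    return (uintptr_t)buf % sizeof(VECTYPE) == 0
        && len >= BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR * sizeof(VECTYPE);
}
```

With integer division, size / BITS_PER_BYTE >= N holds exactly when
size >= BITS_PER_BYTE * N, so the byte-based check matches the bit-based
condition in the patch.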
> +            unsigned long tmp2 =
> +                buffer_find_nonzero_offset(p, ((size / BITS_PER_BYTE) &
> +                           ~(BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR *
> +                             sizeof(VECTYPE) - 1)));
Type mismatch - buffer_find_nonzero_offset returns size_t, which isn't
necessarily the same size as unsigned long. I'm not sure if it can bite
you.
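
To illustrate the concern: on LLP64 targets such as 64-bit Windows,
unsigned long is 32 bits wide while size_t is 64 bits, so storing a size_t
in an unsigned long can truncate. Whether offsets that large can occur here
is not established. The hypothetical helper below (not part of the patch)
only makes the range check explicit.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical check: true if a size_t value survives conversion to
 * unsigned long unchanged. Always true where both types have the same
 * width (e.g. LP64 Linux); can be false on LLP64 targets.
 */
static inline bool fits_in_ulong(size_t value)
{
    return value <= ULONG_MAX;
}
```

Declaring tmp2 as size_t would sidestep the question entirely.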
> +            result += tmp2 * BITS_PER_BYTE;
> +            size -= tmp2 * BITS_PER_BYTE;
> +            p += tmp2 / sizeof(unsigned long);
> +            if (!size) {
> +                return result;
> +            }
> +            if (tmp2) {
Do you really need this condition, or would it suffice to just
'continue;' the loop? Once buffer_find_nonzero_offset returns anything
that leaves size as non-zero, we are guaranteed that the loop will goto
found_middle without any further calls to buffer_find_nonzero_offset.
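
One way the 'continue' variant could look, as a self-contained sketch
rather than the actual bitops code. find_nonzero_chunk is a scalar stand-in
for buffer_find_nonzero_offset, and the 16-byte vector size, the unroll
factor of 8 and the function names are all assumptions. The sketch keeps a
guard on the skipped count: it assumes the helper reports offsets at chunk
granularity, so it can return 0 when the first word is zero but a later
word in the same chunk is not, and an unconditional 'continue' would then
not make progress.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define VEC_BYTES 16                    /* assumed sizeof(VECTYPE) */
#define CHUNK_BYTES (VEC_BYTES * 8)     /* assumed unroll factor of 8 */

/* Scalar stand-in: byte offset of the first chunk holding a nonzero
 * byte, or len if there is none. len must be a multiple of CHUNK_BYTES. */
static size_t find_nonzero_chunk(const void *buf, size_t len)
{
    const unsigned char *b = buf;
    for (size_t off = 0; off < len; off += CHUNK_BYTES) {
        for (size_t i = 0; i < CHUNK_BYTES; i++) {
            if (b[off + i]) {
                return off;
            }
        }
    }
    return len;
}

/* Index of the first set bit, or size if none is set. size is in bits
 * and assumed to be a multiple of BITS_PER_LONG. */
static size_t first_set_bit(const unsigned long *p, size_t size)
{
    size_t result = 0;

    while (size >= BITS_PER_LONG) {
        unsigned long tmp = *p;
        if (tmp) {                      /* plays the role of found_middle */
            while (!(tmp & 1UL)) {
                tmp >>= 1;
                result++;
            }
            return result;
        }
        if ((uintptr_t)p % VEC_BYTES == 0 && size >= CHAR_BIT * CHUNK_BYTES) {
            size_t skipped = find_nonzero_chunk(
                p, (size / CHAR_BIT) & ~(size_t)(CHUNK_BYTES - 1));
            result += skipped * CHAR_BIT;
            size -= skipped * CHAR_BIT;
            p += skipped / sizeof(unsigned long);
            if (skipped) {
                continue;   /* loop head re-tests size and *p */
            }
        }
        p++;
        result += BITS_PER_LONG;
        size -= BITS_PER_LONG;
    }
    return result;
}
```

In this form the loop condition also covers the case where size drops to
zero, so the separate 'if (!size)' return is not needed.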
> +                if ((tmp = *p)) {
> +                    goto found_middle;
> +                }
> +            }
>          }
> +        p++;
>          result += BITS_PER_LONG;
>          size -= BITS_PER_LONG;
>      }
>
--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library http://libvirt.org