qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [RFC] find_next_bit optimizations


From: Peter Lieven
Subject: Re: [Qemu-devel] [RFC] find_next_bit optimizations
Date: Mon, 11 Mar 2013 16:37:15 +0100

Am 11.03.2013 um 16:29 schrieb Paolo Bonzini <address@hidden>:

> Il 11/03/2013 16:24, Peter Lieven ha scritto:
>> 
>>> How would that be different in your patch?  But you can solve it by
>>> making two >= loops, one checking for 4*BITS_PER_LONG and one checking
>>> BITS_PER_LONG.
>> 
>> This is what I have now:
>> 
>> diff --git a/util/bitops.c b/util/bitops.c
>> index e72237a..b0dc93f 100644
>> --- a/util/bitops.c
>> +++ b/util/bitops.c
>> @@ -24,12 +24,13 @@ unsigned long find_next_bit(const unsigned long *addr, 
>> unsigned long size,
>>     const unsigned long *p = addr + BITOP_WORD(offset);
>>     unsigned long result = offset & ~(BITS_PER_LONG-1);
>>     unsigned long tmp;
>> +    unsigned long d0,d1,d2,d3;
>> 
>>     if (offset >= size) {
>>         return size;
>>     }
>>     size -= result;
>> -    offset %= BITS_PER_LONG;
>> +    offset &= (BITS_PER_LONG-1);
>>     if (offset) {
>>         tmp = *(p++);
>>         tmp &= (~0UL << offset);
>> @@ -42,7 +43,19 @@ unsigned long find_next_bit(const unsigned long *addr, 
>> unsigned long size,
>>         size -= BITS_PER_LONG;
>>         result += BITS_PER_LONG;
>>     }
>> -    while (size & ~(BITS_PER_LONG-1)) {
>> +    while (size >= 4*BITS_PER_LONG) {
>> +        d0 = *p;
>> +        d1 = *(p+1);
>> +        d2 = *(p+2);
>> +        d3 = *(p+3);
>> +        if (d0 || d1 || d2 || d3) {
>> +            break;
>> +        }
>> +        p+=4;
>> +        result += 4*BITS_PER_LONG;
>> +        size -= 4*BITS_PER_LONG;
>> +    }
>> +    while (size >= BITS_PER_LONG) {
>>         if ((tmp = *(p++))) {
>>             goto found_middle;
>>         }
>> 
> 
> Minus the %= vs. &=,
> 
> Reviewed-by: Paolo Bonzini <address@hidden>
> 
> Perhaps:
> 
>        tmp = *p;
>        d1 = *(p+1);
>        d2 = *(p+2);
>        d3 = *(p+3);
>        if (tmp) {
>            goto found_middle;
>        }
>        if (d1 || d2 || d3) {
>            break;
>        }

i do not know what gcc interally makes of the d0 || d1 || d2 || d3 ?
i would guess its sth like one addition w/ carry and 1 test?

your proposed change would introduce 2 tests (maybe)?

what about this to be sure?

       tmp = *p;
       d1 = *(p+1);
       d2 = *(p+2);
       d3 = *(p+3);
       if (tmp || d1 || d2 || d3) {
           if (tmp) {
               goto found_middle;
           }
           break;
       }

Peter




reply via email to

[Prev in Thread] Current Thread [Next in Thread]