[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH] count-leading-zeros: new module
From: |
Ondřej Bílka |
Subject: |
Re: [PATCH] count-leading-zeros: new module |
Date: |
Sat, 11 Aug 2012 12:23:56 +0200 |
User-agent: |
Mutt/1.5.20 (2009-06-14) |
On Sat, Aug 11, 2012 at 09:45:16AM +0200, Jim Meyering wrote:
> Eric Blake wrote:
> > I needed gcc's clz to determine the most significant bit of a
> > number (useful for things like truncating to a power of 2),
> > and was surprised it is not a standardized function (the
> > opposite direction of finding the least significant bit is
> ...
> > +/* Expand the code which computes the number of leading zeros of the local
> > + variable 'x' of type TYPE (an unsigned integer type) and returns it
> > + from the current function. */
> > +#if 0 && __GNUC__ > 3 || (__GNUC__ == 3 && __GNUC_MINOR__ >= 4)
> > +# define COUNT_LEADING_ZEROS(BUILTIN, TYPE) \
> > + return x ? BUILTIN (x) : CHAR_BIT * sizeof x;
> > +#else
> > +# define COUNT_LEADING_ZEROS(BUILTIN, TYPE) \
> > + /* This condition is written so as to avoid shifting by more than \
> > + 31 bits at once, and also avoids a random HP-UX cc bug. */ \
> > + verify (((TYPE) -1 >> 31 >> 31 >> 2) == 0); /* TYPE has at most 64 bits
> > */ \
> > + int count = 0; \
> > + if (1 < (TYPE) -1 >> 31) { /* TYPE has more than 32 bits? */ \
> > + count = count_leading_zeros_32 (x >> 31 >> 1); \
> > + if (count < 32) \
> > + return count; \
> > + } \
> > + return count + count_leading_zeros_32 (x);
> > +
> > +/* Compute and return the number of leading zeros in the least
> > + significant 32 bits of X. */
> > +static inline int
> > +count_leading_zeros_32 (unsigned int x)
> > +{
> > + int count = 0;
> > + if (x & 0xffff0000U)
> > + x >>= 16;
> > + else
> > + count += 16;
> > + if (x & 0xff00)
> > + x >>= 8;
> > + else
> > + count += 8;
> > + if (x & 0xf0)
> > + x >>= 4;
> > + else
> > + count += 4;
> > + if (x & 0xc)
> > + x >>= 2;
> > + else
> > + count += 2;
> > + if (x & 2)
> > + x >>= 1;
> > + else
> > + count++;
> > + if (!(x & 1))
> > + count++;
> > + return count;
> > +}
> > +#endif
>
> Hi Eric,
>
> Did you consider using a variant of the following?
> It looks like it would be more efficient.
>
> [From http://graphics.stanford.edu/~seander/bithacks.html]
>
> Find the log base 2 of an N-bit integer in O(lg(N)) operations
> with multiply and lookup
>
> uint32_t v; // find the log base 2 of 32-bit v
> int r; // result goes here
>
> static const int MultiplyDeBruijnBitPosition[32] =
> {
> 0, 9, 1, 10, 13, 21, 2, 29, 11, 14, 16, 18, 22, 25, 3, 30,
> 8, 12, 20, 28, 15, 17, 24, 7, 19, 27, 23, 6, 26, 5, 4, 31
> };
>
> v |= v >> 1; // first round down[sic*] to one less than a power of 2
> v |= v >> 2;
> v |= v >> 4;
> v |= v >> 8;
> v |= v >> 16;
>
> r = MultiplyDeBruijnBitPosition[(uint32_t)(v * 0x07C4ACDDU) >> 27];
>
> The code above computes the log base 2 of a 32-bit integer with a
> small table lookup and multiply. It requires only 13 operations,
> compared to (up to) 20 for the previous method. The purely
> table-based method requires the fewest operations, but this offers
> a reasonable compromise between table size and speed.
>
> [*] I've just reported the error in that comment: s/down/up/
Hi,
Currently fastest way is convert to float and extract exponent. This is
always log2(n) or log2(n)+1 which can be easily repaired.
Here is implementation:
#include <stdint.h>
inline int clz64(int64_t x){
if (x<0) return 0;
union convert {uint64_t i;double f;} un;
un.f=(double) x;
int64_t ret= (un.i >> (64-12)) - 1023;
return 63-(ret - 1 + (x>>ret));
}
inline int clz32(int32_t x){
if (x<0) return 0;
union convert {uint32_t i;float f;} un;
un.f=(float) x;
int ret=(un.i >> (32-9)) - 127;
return 31-(ret - 1 + (x>>ret));
}
inline int clz32_alt(uint32_t x){
union convert {uint64_t i;double f;} un;
un.f=(double) x;
int ret =(un.i >> (64-12)) - 1023;
return 31 - ret;
}
gcc messes this code a bit, it generates following code:
cvtsi2sdq %rdi, %xmm0
movsd %xmm0, -8(%rsp)
movq -8(%rsp), %rcx
--
the printer thinks its a router.