grub-devel

Re: [PATCH] Optimise memset on i386


From: Colin Watson
Subject: Re: [PATCH] Optimise memset on i386
Date: Sat, 24 Jul 2010 23:40:31 +0100
User-agent: Mutt/1.5.18 (2008-05-17)

On Fri, Jul 23, 2010 at 10:56:24AM -0500, address@hidden wrote:
> [snip]
> 
> > +      unsigned long patternl = 0;
> > +      grub_size_t i;
> > +
> > +      for (i = 0; i < sizeof (unsigned long); i++)
> > +       patternl |= ((unsigned long) pattern8) << (8 * i);
> > +
> 
> might I suggest:
> 
> unsigned long patternl = pattern8;
> patternl |= patternl << 8;
> patternl |= patternl << 16;
> patternl |= patternl << 32;
> patternl |= patternl << 64;
> 
> O(lg N) instead of O(N), no loop, no branches, and the compiler should be
> smart enough to optimize away the last two lines on systems with narrower
> long.
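
For reference, the two forms being compared can be sketched side by side as below. This is only an illustration, not the code from the patch: the function names `replicate_loop`, `replicate_doubling` and `check_replication` are hypothetical, and the sketch assumes 8-bit bytes and an `unsigned long` of either 32 or 64 bits. One caveat on the quoted suggestion: in C, shifting by the full width of the type or more is undefined behaviour, so the wider shifts are left out here or guarded by the preprocessor rather than left for the compiler to drop.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Loop form, as in the quoted patch hunk: one shift-and-OR per byte
   of the long.  */
static unsigned long
replicate_loop (unsigned char pattern8)
{
  unsigned long patternl = 0;
  size_t i;

  for (i = 0; i < sizeof (unsigned long); i++)
    patternl |= ((unsigned long) pattern8) << (8 * i);
  return patternl;
}

/* Doubling form: each step doubles the number of filled bytes.
   Assumption: unsigned long is 32 or 64 bits wide.  The 32-bit shift
   is compiled only when long is wider than 32 bits, because shifting
   by the width of the type or more is undefined in C.  */
static unsigned long
replicate_doubling (unsigned char pattern8)
{
  unsigned long patternl = pattern8;

  patternl |= patternl << 8;
  patternl |= patternl << 16;
#if ULONG_MAX > 0xFFFFFFFFUL
  patternl |= patternl << 32;
#endif
  return patternl;
}

/* Usage check: both forms must agree for every byte value.  */
static void
check_replication (void)
{
  unsigned int b;

  for (b = 0; b <= 0xFF; b++)
    assert (replicate_loop ((unsigned char) b)
            == replicate_doubling ((unsigned char) b));
}
```

Both functions produce the same word; they differ only in how many shift-and-OR steps it takes to build it.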

I no longer have the system on which I benchmarked this.  However, since
N is always either 4 or 8 on current targets, this can only amount to a
micro-optimisation, and I don't think it can possibly matter much; we're
talking about a handful of cycles at most.  Do we really need to spend
time bikeshedding this?  The important thing is taking only one cache
stall per long rather than one per byte; anything else is likely to be
noise.
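
To make the per-long versus per-byte point concrete, here is a minimal sketch of a word-at-a-time fill. It is not the patch itself: `memset_wordwise` is a hypothetical name, the alignment handling is an assumption about how such a routine would typically be laid out, and the pattern is built with the loop from the quoted hunk. The word store through an `unsigned long *` also assumes the destination may legitimately be accessed as `unsigned long`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch: fill LEN bytes at S with the byte value C,
   using one store per long for the aligned middle of the buffer.  */
static void *
memset_wordwise (void *s, int c, size_t len)
{
  unsigned char *p = s;
  unsigned char pattern8 = (unsigned char) c;
  unsigned long patternl = 0;
  size_t i;

  assert (s != NULL);

  /* Replicate the byte into every byte of a long (loop form).  */
  for (i = 0; i < sizeof (unsigned long); i++)
    patternl |= ((unsigned long) pattern8) << (8 * i);

  /* Head: byte stores until P is aligned to a long boundary.  */
  while (len > 0 && ((uintptr_t) p & (sizeof (unsigned long) - 1)) != 0)
    {
      *p++ = pattern8;
      len--;
    }

  /* Middle: one store per long instead of one per byte.  */
  while (len >= sizeof (unsigned long))
    {
      *(unsigned long *) p = patternl;
      p += sizeof (unsigned long);
      len -= sizeof (unsigned long);
    }

  /* Tail: remaining bytes, fewer than sizeof (unsigned long).  */
  while (len > 0)
    {
      *p++ = pattern8;
      len--;
    }

  return s;
}
```

The pattern is built once per call, while the middle loop runs once per long of the buffer, which is why the cost of building the pattern is small next to the stores for any buffer of non-trivial size.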

-- 
Colin Watson                                       address@hidden


