Re: [PATCH] Optimise memset on i386

From: Christian Franke
Subject: Re: [PATCH] Optimise memset on i386
Date: Fri, 23 Jul 2010 19:34:51 +0200
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv: Gecko/20100504 SeaMonkey/2.0.5

richardvoigt wrote:

> might I suggest:
>
> unsigned long patternl = pattern8;
> patternl |= patternl << 8;
> patternl |= patternl << 16;
> patternl |= patternl << 32;
> patternl |= patternl << 64;
>
> O(lg N) instead of O(N), no loop, no branches, and the compiler should be smart enough to optimize away the last two lines on systems with narrower long.

Unfortunately, the compiler does not optimize those lines away. gcc 4.5.0, at least, prints a warning but still emits code for the out-of-range shifts.

$ cat <<EOF >f.c
unsigned long f(unsigned long x)
{
  x |= x << 32;
  x |= x << 64;
  return x;
}
EOF
$ gcc -O3 -S f.c
f.c: In function ‘f’:
f.c:3: warning: left shift count >= width of type
f.c:4: warning: left shift count >= width of type

$ cat f.s
        pushl   %ebp
        movl    $32, %ecx
        movl    %esp, %ebp
        movl    8(%ebp), %eax
        popl    %ebp
        movl    %eax, %edx
        sall    %cl, %edx
        movl    $64, %ecx
        orl     %eax, %edx
        movl    %edx, %eax
        sall    %cl, %eax
        orl     %edx, %eax
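
A shift count equal to or larger than the width of the type is undefined behaviour in C, so the compiler is not required to drop such shifts, and the listing above shows that it keeps them. One possible way to avoid them entirely is to replicate the byte with a multiplication instead. This is only a sketch, not what the patch does: replicate_byte is a hypothetical name, and the code assumes 8-bit bytes.

```c
#include <limits.h>

/* Hypothetical helper: copy one byte into every byte of an unsigned long
   without any shift whose count could reach the width of the type. */
static unsigned long replicate_byte(unsigned char pattern8)
{
    /* ULONG_MAX / 0xFF is 0x01010101 when long is 32 bits wide and
       0x0101010101010101 when it is 64 bits wide, so the product
       repeats pattern8 in each byte and cannot overflow. */
    return (unsigned long)pattern8 * (ULONG_MAX / 0xFF);
}
```

The divisor is a compile-time constant, so the same source should adapt to either width of long without any conditional code.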

Christian Franke