freetype-devel

Re: [ft-devel] FT_MulFix assembly


From: Miles Bader
Subject: Re: [ft-devel] FT_MulFix assembly
Date: Mon, 06 Sep 2010 17:02:38 +0900

James Cloos <address@hidden> writes:
>     __asm__ __volatile__ (
>       "movq  %1, %%rax\n"
>       "imul  %2\n"
>       "addq  %%rdx, %%rax\n"
>       "addq  $0x8000, %%rax\n"
>       "sarq  $16, %%rax\n"
>       : "=a"(result)
>       : "g"(a), "g"(b)
>       : "rdx" );
>
> The above code has a latency of 1+5+1+1+1 = 10 clocks on an amdfam10 cpu.
...
> Is the amd64 version desired, given how little benefit it has?

If this is being used in a context where it might benefit from more
scheduling, etc., perhaps it would help to let the compiler generate the
non-imul insns (since it's pretty good at those)?

E.g. something like:

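  static __inline__ long
  FT_MulFix_x86_64 (long a, long b)
  {
    long  mr1, mr2;

    /* One-operand imul: rdx:rax = rax * operand.  The operand must be a
       register or memory ("rm"); "g" would also allow an immediate, which
       this form cannot encode.  The q suffix fixes the operand size when
       %3 ends up as a memory operand.  */
    __asm__ ("imulq  %3\n" : "=a" (mr1), "=d" (mr2) : "a" (a), "rm" (b) : "cc");
    return ((mr1 + mr2) + 0x8000) >> 16;
  }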
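
For reference, the arithmetic that the asm is meant to perform can be
sketched in plain C.  This is only an illustration under some
assumptions: the name mulfix_portable is hypothetical (not part of
FreeType), the inputs are taken to be 16.16 values that fit in 32 bits
(so the 64-bit product cannot overflow and rdx after the imul is just
the sign of the product, 0 or -1), and >> on a negative value is assumed
to be an arithmetic shift, as it is with GCC.

```c
#include <stdint.h>

/* Hypothetical portable counterpart of the asm sketch; not FreeType API. */
static inline int32_t
mulfix_portable (int32_t a, int32_t b)
{
  int64_t  prod = (int64_t) a * b;  /* full product; fits in 64 bits */
  int64_t  sign = prod >> 63;       /* 0 or -1, what rdx holds after imul */

  /* Adding the sign before the 0x8000 bias makes exact halves round
     away from zero for both positive and negative products. */
  return (int32_t) ((prod + sign + 0x8000) >> 16);
}
```

With this sketch, mulfix_portable (0x8000, 0x8000) gives 0x4000
(0.5 * 0.5 = 0.25), and mulfix_portable (-1, 0x8000) gives -1 rather
than 0, which is the effect of adding the high word in the asm version.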


-miles

-- 
Omochiroi!



