Re: [ft-devel] FT_MulFix assembly
From: James Cloos
Subject: Re: [ft-devel] FT_MulFix assembly
Date: Mon, 06 Sep 2010 15:28:57 -0400
User-agent: Gnus/5.110011 (No Gnus v0.11) Emacs/24.0.50 (gnu/linux)
>>>>> "MB" == Miles Bader <address@hidden> writes:
MB> The compiler generates the following assembly:
MB> mov %esi, %eax
MB> mov %edi, %edi
MB> imulq %rdi, %rax
MB> addq $32768, %rax
MB> shrq $16, %rax
That does not match the C code, though; it rounds negative values wrong.
The C version rounds to nearest, with ties going away from zero.
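To make the mismatch concrete, here is a minimal C sketch of the two
rounding rules. It is a paraphrase for illustration, not the FreeType
source: the function names are hypothetical, and it assumes that >> on a
negative value is an arithmetic shift (which is what sarq does).

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>

/* Right-shifting a negative value is implementation-defined in C;
   this sketch assumes an arithmetic shift. */
static_assert((-1 >> 1) == -1, "arithmetic right shift assumed");

/* Sign-split rounding (hypothetical name): round |a*b| half up, then
   restore the sign, so ties end up going away from zero. */
static int64_t mulfix_away_from_zero(int32_t a, int32_t b)
{
    int64_t x = a, y = b;
    int sign = 1;

    if (x < 0) { x = -x; sign = -sign; }
    if (y < 0) { y = -y; sign = -sign; }

    int64_t c = (x * y + 0x8000) >> 16;
    return sign > 0 ? c : -c;
}

/* Add 32768 and shift (hypothetical name): ties go toward +infinity,
   which differs from the above only for negative products. */
static int64_t mulfix_add_half_and_shift(int32_t a, int32_t b)
{
    int64_t p = (int64_t)a * b;
    return (p + 0x8000) >> 16;
}
```

With a = -1 and b = 0x8000 the product is -0x8000, an exact tie: the
first function returns -1 and the second returns 0. For non-negative
products the two agree.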
The one-operand form of imulq generates a 128-bit result in rdx:rax; the
more significant half (rdx) will be 0 iff the product is >=0 and
will be -1 if the product is <0, given that the multiplicands were
only 32 bits. Adding that, in addition to the 32768, to rax ensures
that the result of the >>=16 is rounded the way freetype wants.
If you use the two-operand form of imul, you have to copy the msb of the
result (or do a compare and jump, like the C code) to determine whether
to add 0x8000 or 0x7FFF.
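The two branch-free variants described above can be modelled in C
roughly as follows. This is only a sketch under stated assumptions: the
function names are made up for illustration, the variable hi stands in
for what rdx would hold after the one-operand imulq, and >> on a
negative value is assumed to be an arithmetic shift.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>

/* Right-shifting a negative value is implementation-defined in C;
   this sketch assumes an arithmetic shift. */
static_assert((-1 >> 1) == -1, "arithmetic right shift assumed");

/* Model of the one-operand imulq approach (hypothetical name): the
   upper half of the 128-bit product is pure sign extension, so it is
   0 or -1 and can be added straight into the low half. */
static int64_t mulfix_add_sign_word(int32_t a, int32_t b)
{
    int64_t lo = (int64_t)a * b;   /* low 64 bits (rax)        */
    int64_t hi = lo >> 63;         /* stand-in for rdx: 0 or -1 */
    return (lo + hi + 0x8000) >> 16;
}

/* Model of the two-operand imul approach (hypothetical name): copy
   the msb of the product and use it to pick 0x8000 or 0x7FFF. */
static int64_t mulfix_select_bias(int32_t a, int32_t b)
{
    int64_t p   = (int64_t)a * b;
    int64_t msb = (int64_t)((uint64_t)p >> 63);   /* 1 if p < 0 */
    return (p + 0x8000 - msb) >> 16;
}
```

Both add 0x8000 to a non-negative product and 0x7FFF to a negative one,
so for a = -1, b = 0x8000 each returns -1, matching the away-from-zero
rounding of the C version.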
Matching the rounding was the hardest part. What provided the most
(in-order) speedups was noting that the upper 64 bits of the 128-bit
product would always be just sign-extension bits and that, because of
the prototype of FT_MulFix() itself, the values are already promoted
to 64 bits before they get to the assembly.
If it can be done better, though, I'd be happy to know!
Thanks for also looking at it.
-JimC
--
James Cloos <address@hidden> OpenPGP: 1024D/ED7DAEA6