On 6/3/2014 8:43 AM, Tristan Gingold wrote:
On 03 Jun 2014, at 12:02, Alexander Graf <address@hidden> wrote:
On 06/03/2014 11:14 AM, Tristan Gingold wrote:
Remove the code that reduces the result to float32, as the frsqrte
instruction is defined to return a double-precision estimate of
the reciprocal square root.
Although reducing the fractional part is harmless (as the estimation
must have at least 12 bits of precision according to the old PEM),
reducing the exponent range is not correct.
Signed-off-by: Tristan Gingold <address@hidden>
I couldn't find a reference to doubles in ISA 2.07. Is frsqrte supposed to
return doubles on all cores?
I have just checked ISA V 2.06 (will download 2.07 if necessary). There are
now two instructions: frsqrte and frsqrtes. The second one is for single
precision, so the first one is for double.
[ If you look at IBM AIX assembly manual:
http://publib.boulder.ibm.com/infocenter/pseries/v5r3/index.jsp?topic=/com.ibm.aix.aixassem/doc/alangref/frsqrte.htm
they clearly mention that frsqrte operates on doubles on the 603 and 604 but
is not implemented on the 601]
Or is this implementation specific?
This instruction is optional and the precision of the estimation is
implementation dependent.
I have looked at some implementation manuals (604, 603, e300) and they don't
mention single.
Also, is frsqrte the only instruction affected?
Yes. Operation fres operates on single.
Tristan.
Alex
I concur with the assessment: frsqrte (reciprocal square root estimate) is an
optional, double-precision instruction
and has always been double precision. And there is, indeed, a single-precision
version (frsqrtes) that came along later
(sometime between 1998 and 2006, and thus later than the 603/604).
The instruction is, as the name implies, an *estimate*. I suspect the code
that rounds to single precision and then
back to double was a clever way to truncate the mantissa and thus look more
like the result produced by some PPC
implementation (hardware). Unfortunately, it doesn't work if the original
double-precision result is outside the range representable
in single precision, as Tristan likely discovered :)
And to be clear, helper_frsqrte is used for both frsqrte and frsqrtes. But the
code in translate.c (gen_frsqrtes) handles
the "singleness" of that instruction.
The patch looks correct to me.
Reviewed-by: Tom Musta <address@hidden>