From: Maciej W. Rozycki
Subject: Re: [Qemu-devel] [PATCH 7/7] target-mips: Add IEEE 754-2008 features support
Date: Tue, 10 Feb 2015 14:30:43 +0000 (GMT)
User-agent: Alpine 2.11 (LFD 23 2013-08-11)
On Tue, 10 Feb 2015, Leon Alrae wrote:
> > These cases could be addressed by either replacing subtraction from 0.0
> > with multiplication by -1.0, or by tweaking the rounding mode as needed
> > temporarily. Given that the computational cost of multiplication is
> > uncertain and likely higher or at best the same as the cost of addition or
> > subtraction, I'd be leaning towards the latter solution.
>
> My first thought was to treat zero in NEG.fmt as a special case and use
> float32_chs() for it. But tweaking the rounding mode temporarily
> probably is better as we will get consistent behaviour for zero as well
> as input denormals which are squashed in float32_sub() when
> flush_inputs_to_zero flag is set (actually I'm not sure if legacy fp
> instructions should flush input denormals, but according to the spec
> this is implementation dependent so I won't worry about this).
As expected, setting CP1.FCSR.FS on a randomly picked R4400 processor:
CPU0 revision is: 00000440 (R4400SC)
FPU revision is: 00000500
does flush a denormal input of NEG.fmt to 0. Given this program:
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    union {
        double d;
        uint64_t i;
    } x = { .i = 0x000123456789abcdULL }, y, z;  /* x is a denormal */
    unsigned long tmp, fcsr;

    printf("x: %016" PRIx64 "\n", x.i);

    /* Negate with FCSR.FS (bit 24) set, then restore the saved FCSR. */
    asm volatile(
        " cfc1 %1, $31\n"
        " or %2, %1, %4\n"
        " ctc1 %2, $31\n"
        " neg.d %0, %3\n"
        " ctc1 %1, $31"
        : "=f" (y.d), "=&r" (fcsr), "=&r" (tmp)
        : "f" (x.d), "r" (1 << 24));
    printf("y: %016" PRIx64 "\n", y.i);

    /* Negate the same denormal with FCSR left as it was. */
    asm volatile(
        " neg.d %0, %1"
        : "=f" (z.d) : "f" (x.d));
    printf("z: %016" PRIx64 "\n", z.i);

    /* Negate +0. */
    x.i = 0;
    printf("+: %016" PRIx64 "\n", x.i);
    asm volatile(
        " neg.d %0, %1"
        : "=f" (y.d) : "f" (x.d));
    printf("-: %016" PRIx64 "\n", y.i);

    return 0;
}
I get this output:
x: 000123456789abcd
y: 8000000000000000
z: 800123456789abcd
+: 0000000000000000
-: 8000000000000000
under Linux. According to the R4400 documentation, the value of `z' must
have been calculated by the in-kernel emulator, invoked from the
Unimplemented Operation exception handler, because on this processor
implementation any denormalised operand causes that exception, except with
compare instructions. In any case all the results are consistent, so we
don't actually have to do anything for the flush-to-zero mode; our
calculation should work out as expected (as long as the `float_round_down'
rounding mode is respected, that is).
While at it, I included the result of negating 0 for completeness.
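For illustration, the results above can be modelled at the bit level with a
minimal sketch like the one below. It is not taken from the patch or from
QEMU: model_neg_d(), f64_is_denormal() and check_against_r4400_output() are
hypothetical names, and the sketch assumes that flushing keeps the sign of
the denormal input, which is consistent with the `y' value above but not
verified beyond that. NaN handling is left out.

```c
#include <assert.h>
#include <stdint.h>

#define F64_SIGN UINT64_C(0x8000000000000000)
#define F64_EXP  UINT64_C(0x7ff0000000000000)
#define F64_FRAC UINT64_C(0x000fffffffffffff)

/* Hypothetical helper: nonzero if `bits' encodes a double denormal
 * (zero exponent field, nonzero fraction field). */
int f64_is_denormal(uint64_t bits)
{
    return (bits & F64_EXP) == 0 && (bits & F64_FRAC) != 0;
}

/* Hypothetical model of neg.d on raw double bits.  If `fs' is nonzero,
 * a denormal input is first flushed to a zero of the same sign
 * (an assumption); the sign bit is then flipped. */
uint64_t model_neg_d(uint64_t bits, int fs)
{
    if (fs && f64_is_denormal(bits))
        bits &= F64_SIGN;
    return bits ^ F64_SIGN;
}

/* Compare the model with the x/y/z and +/- values quoted above. */
void check_against_r4400_output(void)
{
    uint64_t x = UINT64_C(0x000123456789abcd);

    assert(model_neg_d(x, 1) == UINT64_C(0x8000000000000000)); /* y */
    assert(model_neg_d(x, 0) == UINT64_C(0x800123456789abcd)); /* z */
    assert(model_neg_d(0, 0) == UINT64_C(0x8000000000000000)); /* - */
}
```

Calling check_against_r4400_output() from a test program is enough to
confirm that the model agrees with the five values printed on the R4400.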
Maciej