From: Richard Henderson
Subject: Re: [Qemu-ppc] [Qemu-devel] [V3 PATCH 12/14] target-ppc: VSX Stage 4: Add Scalar SP Fused Multiply-Adds
Date: Wed, 04 Dec 2013 13:23:22 +1300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.1.0

On 12/04/2013 04:58 AM, Tom Musta wrote:
> This patch adds the Single Precision VSX Scalar Fused Multiply-Add
> instructions: xsmaddasp, xsmaddmsp, xsmsubasp, xsmsubmsp, xsnmaddasp,
> xsnmaddmsp, xsnmsubasp, xsnmsubmsp.
>
> The existing VSX_MADD() macro is modified to support rounding of the
> intermediate double precision result to single precision.
>
> V2: Re-implemented per feedback from Richard Henderson. In order to
> avoid double rounding and incorrect results, the operands must be
> converted to true single precision values and use the single precision
> fused multiply/add routine.
>
> V3: Re-implemented per feedback from Richard Henderson. The implementation
> now uses a round-to-odd algorithm to address subtle double rounding errors.
>
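For readers following the V3 note above, here is a minimal sketch of why rounding the intermediate result to odd avoids the double rounding problem. It is an integer model of a significand, not QEMU's softfloat code, and the names round_nearest_even and round_to_odd are hypothetical helpers introduced only for this illustration. The model assumes the intermediate format keeps at least two more bits than the final one, which holds for double versus single precision.

```c
#include <assert.h>
#include <stdint.h>

/* Drop the low `drop` bits of x, rounding to nearest with ties to even. */
static inline uint64_t round_nearest_even(uint64_t x, unsigned drop)
{
    assert(drop >= 1 && drop < 64);
    uint64_t q = x >> drop;                    /* truncated result */
    uint64_t rem = x & ((1ULL << drop) - 1);   /* discarded bits */
    uint64_t half = 1ULL << (drop - 1);        /* exactly half an ulp */

    if (rem > half || (rem == half && (q & 1))) {
        q++;
    }
    return q;
}

/*
 * Drop the low `drop` bits of x, rounding to odd: truncate, then force
 * the least significant bit to 1 if anything non-zero was discarded.
 * This mirrors "round toward zero, then OR in the inexact flag".
 */
static inline uint64_t round_to_odd(uint64_t x, unsigned drop)
{
    assert(drop < 64);
    uint64_t q = x >> drop;
    uint64_t rem = x & ((1ULL << drop) - 1);

    return q | (rem != 0);
}
```

With x = 0x281 and 8 bits to drop in total, rounding once gives round_nearest_even(x, 8) == 3. Rounding in two nearest-even steps of 4 bits each gives 2, because the first step hides the low bit and the second step then sees an exact tie. Replacing the first step with round_to_odd(x, 4) keeps that information in the low bit, and the final nearest-even step again gives 3.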
> Signed-off-by: Tom Musta <address@hidden>
> ---
> target-ppc/fpu_helper.c | 84 ++++++++++++++++++++++++++++++----------------
> target-ppc/helper.h | 8 ++++
> target-ppc/translate.c | 16 +++++++++
> 3 files changed, 79 insertions(+), 29 deletions(-)
>
> diff --git a/target-ppc/fpu_helper.c b/target-ppc/fpu_helper.c
> index 8825db2..077d057 100644
> --- a/target-ppc/fpu_helper.c
> +++ b/target-ppc/fpu_helper.c
> @@ -2192,7 +2192,7 @@ VSX_TSQRT(xvtsqrtsp, 4, float32, f32, -126, 23)
> * afrm - A form (1=A, 0=M)
> * sfprf - set FPRF
> */
> -#define VSX_MADD(op, nels, tp, fld, maddflgs, afrm, sfprf) \
> +#define VSX_MADD(op, nels, tp, fld, maddflgs, afrm, sfprf, r2sp) \
>  void helper_##op(CPUPPCState *env, uint32_t opcode) \
>  { \
>      ppc_vsr_t xt_in, xa, xb, xt_out; \
> @@ -2218,8 +2218,18 @@ void helper_##op(CPUPPCState *env, uint32_t opcode) \
>      for (i = 0; i < nels; i++) { \
>          float_status tstat = env->fp_status; \
>          set_float_exception_flags(0, &tstat); \
> -        xt_out.fld[i] = tp##_muladd(xa.fld[i], b->fld[i], c->fld[i], \
> -                                     maddflgs, &tstat); \
> +        if (r2sp && (tstat.float_rounding_mode == float_round_nearest_even)) {\
> +            /* Avoid double rounding errors by rounding the intermediate */ \
> +            /* result to odd. */ \
> +            set_float_rounding_mode(float_round_to_zero, &tstat); \
> +            xt_out.fld[i] = tp##_muladd(xa.fld[i], b->fld[i], c->fld[i], \
> +                                        maddflgs, &tstat); \
> +            xt_out.fld[i] |= (get_float_exception_flags(&tstat) & \
> +                              float_flag_inexact) != 0; \
> +        } else { \
> +            xt_out.fld[i] = tp##_muladd(xa.fld[i], b->fld[i], c->fld[i], \
> +                                        maddflgs, &tstat); \
> +        } \
>          env->fp_status.float_exception_flags |= tstat.float_exception_flags; \
>  \
>          if (unlikely(tstat.float_exception_flags & float_flag_invalid)) { \
> @@ -2242,6 +2252,13 @@ void helper_##op(CPUPPCState *env, uint32_t opcode) \
>                  fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXISI, sfprf); \
>              } \
>          } \
> + \
> +        if (r2sp) { \
> +            float32 tmp32 = float64_to_float32(xt_out.fld[i], \
> +                                               &env->fp_status); \
> +            xt_out.fld[i] = float32_to_float64(tmp32, &env->fp_status); \
> +        } \
> + \

Use helper_frsp here rather than open-coding the float64 -> float32 -> float64
round trip.

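As a rough illustration of what that final step computes, the following standalone sketch models the float64 -> float32 -> float64 round trip with plain C conversions. It is not QEMU's helper_frsp and does not model its exception handling; round_to_single is a hypothetical name used only here, and the sketch assumes the host rounds the narrowing conversion in its default round-to-nearest-even mode.

```c
#include <assert.h>

/*
 * Round a double to single precision and widen it back, so the result
 * is a double whose value is exactly representable as a float.
 */
static inline double round_to_single(double x)
{
    /* volatile keeps the compiler from eliding the narrowing step */
    volatile float narrowed = (float)x;

    return (double)narrowed;   /* widening float -> double is exact */
}
```

For example, round_to_single(1.5) returns 1.5 unchanged, while round_to_single(1.0 + 0x1p-30) returns 1.0 because the extra bit lies below half an ulp of single precision.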
Otherwise,
Reviewed-by: Richard Henderson <address@hidden>
r~