[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[PATCH v2 38/69] target/arm: Handle FPCR.AH in negation steps in FCADD
From: |
Peter Maydell |
Subject: |
[PATCH v2 38/69] target/arm: Handle FPCR.AH in negation steps in FCADD |
Date: |
Sat, 1 Feb 2025 16:39:41 +0000 |
The negation steps in FCADD must honour FPCR.AH's "don't change the
sign of a NaN" semantics. Implement this by encoding FPCR.AH into
the SIMD data field passed to the helper and using that to decide
whether to negate the values.
The construction of neg_imag and neg_real were done to make it easy
to apply both in parallel with two simple logical operations. This
changed with FPCR.AH, which is more complex than that. Switch to
an approach closer to the pseudocode, where we extract the rot
parameter from the SIMD data word and negate the appropriate
input value.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
target/arm/tcg/translate-a64.c | 10 +++++--
target/arm/tcg/vec_helper.c | 54 +++++++++++++++++++---------------
2 files changed, 38 insertions(+), 26 deletions(-)
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index 0c1e97e6c40..52f93cb905b 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -6117,8 +6117,14 @@ static gen_helper_gvec_3_ptr * const f_vector_fcadd[3] =
{
gen_helper_gvec_fcadds,
gen_helper_gvec_fcaddd,
};
-TRANS_FEAT(FCADD_90, aa64_fcma, do_fp3_vector, a, 0, f_vector_fcadd)
-TRANS_FEAT(FCADD_270, aa64_fcma, do_fp3_vector, a, 1, f_vector_fcadd)
+/*
+ * Encode FPCR.AH into the data so the helper knows whether the
+ * negations it does should avoid flipping the sign bit on a NaN
+ */
+TRANS_FEAT(FCADD_90, aa64_fcma, do_fp3_vector, a, 0 | (s->fpcr_ah << 1),
+ f_vector_fcadd)
+TRANS_FEAT(FCADD_270, aa64_fcma, do_fp3_vector, a, 1 | (s->fpcr_ah << 1),
+ f_vector_fcadd)
static bool trans_FCMLA_v(DisasContext *s, arg_FCMLA_v *a)
{
diff --git a/target/arm/tcg/vec_helper.c b/target/arm/tcg/vec_helper.c
index 0b84a562c03..b181b9734d4 100644
--- a/target/arm/tcg/vec_helper.c
+++ b/target/arm/tcg/vec_helper.c
@@ -879,19 +879,21 @@ void HELPER(gvec_fcaddh)(void *vd, void *vn, void *vm,
float16 *d = vd;
float16 *n = vn;
float16 *m = vm;
- uint32_t neg_real = extract32(desc, SIMD_DATA_SHIFT, 1);
- uint32_t neg_imag = neg_real ^ 1;
+ bool rot = extract32(desc, SIMD_DATA_SHIFT, 1);
+ bool fpcr_ah = extract64(desc, SIMD_DATA_SHIFT + 1, 1);
uintptr_t i;
- /* Shift boolean to the sign bit so we can xor to negate. */
- neg_real <<= 15;
- neg_imag <<= 15;
-
for (i = 0; i < opr_sz / 2; i += 2) {
float16 e0 = n[H2(i)];
- float16 e1 = m[H2(i + 1)] ^ neg_imag;
+ float16 e1 = m[H2(i + 1)];
float16 e2 = n[H2(i + 1)];
- float16 e3 = m[H2(i)] ^ neg_real;
+ float16 e3 = m[H2(i)];
+
+ if (rot) {
+ e3 = float16_maybe_ah_chs(e3, fpcr_ah);
+ } else {
+ e1 = float16_maybe_ah_chs(e1, fpcr_ah);
+ }
d[H2(i)] = float16_add(e0, e1, fpst);
d[H2(i + 1)] = float16_add(e2, e3, fpst);
@@ -906,19 +908,21 @@ void HELPER(gvec_fcadds)(void *vd, void *vn, void *vm,
float32 *d = vd;
float32 *n = vn;
float32 *m = vm;
- uint32_t neg_real = extract32(desc, SIMD_DATA_SHIFT, 1);
- uint32_t neg_imag = neg_real ^ 1;
+ bool rot = extract32(desc, SIMD_DATA_SHIFT, 1);
+ bool fpcr_ah = extract64(desc, SIMD_DATA_SHIFT + 1, 1);
uintptr_t i;
- /* Shift boolean to the sign bit so we can xor to negate. */
- neg_real <<= 31;
- neg_imag <<= 31;
-
for (i = 0; i < opr_sz / 4; i += 2) {
float32 e0 = n[H4(i)];
- float32 e1 = m[H4(i + 1)] ^ neg_imag;
+ float32 e1 = m[H4(i + 1)];
float32 e2 = n[H4(i + 1)];
- float32 e3 = m[H4(i)] ^ neg_real;
+ float32 e3 = m[H4(i)];
+
+ if (rot) {
+ e3 = float32_maybe_ah_chs(e3, fpcr_ah);
+ } else {
+ e1 = float32_maybe_ah_chs(e1, fpcr_ah);
+ }
d[H4(i)] = float32_add(e0, e1, fpst);
d[H4(i + 1)] = float32_add(e2, e3, fpst);
@@ -933,19 +937,21 @@ void HELPER(gvec_fcaddd)(void *vd, void *vn, void *vm,
float64 *d = vd;
float64 *n = vn;
float64 *m = vm;
- uint64_t neg_real = extract64(desc, SIMD_DATA_SHIFT, 1);
- uint64_t neg_imag = neg_real ^ 1;
+ bool rot = extract32(desc, SIMD_DATA_SHIFT, 1);
+ bool fpcr_ah = extract64(desc, SIMD_DATA_SHIFT + 1, 1);
uintptr_t i;
- /* Shift boolean to the sign bit so we can xor to negate. */
- neg_real <<= 63;
- neg_imag <<= 63;
-
for (i = 0; i < opr_sz / 8; i += 2) {
float64 e0 = n[i];
- float64 e1 = m[i + 1] ^ neg_imag;
+ float64 e1 = m[i + 1];
float64 e2 = n[i + 1];
- float64 e3 = m[i] ^ neg_real;
+ float64 e3 = m[i];
+
+ if (rot) {
+ e3 = float64_maybe_ah_chs(e3, fpcr_ah);
+ } else {
+ e1 = float64_maybe_ah_chs(e1, fpcr_ah);
+ }
d[i] = float64_add(e0, e1, fpst);
d[i + 1] = float64_add(e2, e3, fpst);
--
2.34.1
- [PATCH v2 29/69] target/arm: Implement FPCR.AH semantics for SVE FMIN/FMAX immediate, (continued)
- [PATCH v2 29/69] target/arm: Implement FPCR.AH semantics for SVE FMIN/FMAX immediate, Peter Maydell, 2025/02/01
- [PATCH v2 31/69] target/arm: Implement FPCR.AH handling of negation of NaN, Peter Maydell, 2025/02/01
- [PATCH v2 32/69] target/arm: Implement FPCR.AH handling for scalar FABS and FABD, Peter Maydell, 2025/02/01
- [PATCH v2 33/69] target/arm: Handle FPCR.AH in vector FABD, Peter Maydell, 2025/02/01
- [PATCH v2 34/69] target/arm: Handle FPCR.AH in SVE FNEG, Peter Maydell, 2025/02/01
- [PATCH v2 35/69] target/arm: Handle FPCR.AH in SVE FABS, Peter Maydell, 2025/02/01
- [PATCH v2 36/69] target/arm: Handle FPCR.AH in SVE FABD, Peter Maydell, 2025/02/01
- [PATCH v2 37/69] target/arm: Handle FPCR.AH in negation steps in SVE FCADD, Peter Maydell, 2025/02/01
- [PATCH v2 39/69] target/arm: Handle FPCR.AH in FRECPS and FRSQRTS scalar insns, Peter Maydell, 2025/02/01
- [PATCH v2 40/69] target/arm: Handle FPCR.AH in FRECPS and FRSQRTS vector insns, Peter Maydell, 2025/02/01
- [PATCH v2 38/69] target/arm: Handle FPCR.AH in negation steps in FCADD,
Peter Maydell <=
- [PATCH v2 41/69] target/arm: Handle FPCR.AH in negation step in FMLS (indexed), Peter Maydell, 2025/02/01
- [PATCH v2 42/69] target/arm: Handle FPCR.AH in negation in FMLS (vector), Peter Maydell, 2025/02/01
- [PATCH v2 43/69] target/arm: Handle FPCR.AH in negation step in SVE FMLS (vector), Peter Maydell, 2025/02/01
- [PATCH v2 44/69] target/arm: Handle FPCR.AH in SVE FTSSEL, Peter Maydell, 2025/02/01
- [PATCH v2 47/69] target/arm: Handle FPCR.AH in FCMLA by index, Peter Maydell, 2025/02/01
- [PATCH v2 48/69] target/arm: Handle FPCR.AH in SVE FCMLA, Peter Maydell, 2025/02/01
- [PATCH v2 45/69] target/arm: Handle FPCR.AH in SVE FTMAD, Peter Maydell, 2025/02/01
- [PATCH v2 46/69] target/arm: Handle FPCR.AH in vector FCMLA, Peter Maydell, 2025/02/01
- [PATCH v2 49/69] target/arm: Handle FPCR.AH in FMLSL (by element and vector), Peter Maydell, 2025/02/01
- [PATCH v2 52/69] target/arm: Enable FEAT_AFP for '-cpu max', Peter Maydell, 2025/02/01