Re: [PATCH v2 2/9] target/ppc: Implemented vector divide instructions

qemu-devel

From:	Richard Henderson
Subject:	Re: [PATCH v2 2/9] target/ppc: Implemented vector divide instructions
Date:	Mon, 11 Apr 2022 18:51:09 -0700
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.7.0

On 4/5/22 12:55, Lucas Mateus Castro(alqotel) wrote:

From: "Lucas Mateus Castro (alqotel)" <lucas.araujo@eldorado.org.br>

Implement the following PowerISA v3.1 instructions:
vdivsw: Vector Divide Signed Word
vdivuw: Vector Divide Unsigned Word
vdivsd: Vector Divide Signed Doubleword
vdivud: Vector Divide Unsigned Doubleword

Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
---
  target/ppc/insn32.decode            |  7 ++++
  target/ppc/translate/vmx-impl.c.inc | 59 +++++++++++++++++++++++++++++
  2 files changed, 66 insertions(+)

diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index ac2d3da9a7..597768558b 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -703,3 +703,10 @@ XVTLSBB         111100 ... -- 00010 ..... 111011011 . - @XX2_bf_xb
  &XL_s           s:uint8_t
  @XL_s           ......-------------- s:1 .......... -   &XL_s
  RFEBB           010011-------------- .   0010010010 -   @XL_s
+
+## Vector Division Instructions
+
+VDIVSW          000100 ..... ..... ..... 00110001011    @VX
+VDIVUW          000100 ..... ..... ..... 00010001011    @VX
+VDIVSD          000100 ..... ..... ..... 00111001011    @VX
+VDIVUD          000100 ..... ..... ..... 00011001011    @VX
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index 6101bca3fd..be35d6fdf3 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -3236,6 +3236,65 @@ TRANS(VMULHSD, do_vx_mulh, true , do_vx_vmulhd_i64)
  TRANS(VMULHUW, do_vx_mulh, false, do_vx_vmulhw_i64)
  TRANS(VMULHUD, do_vx_mulh, false, do_vx_vmulhd_i64)

+#define TRANS_VDIV_VMOD(FLAGS, NAME, VECE, FNI4_FUNC, FNI8_FUNC) \
+static bool trans_##NAME(DisasContext *ctx, arg_VX *a)                  \
+{                                                                       \
+    static const GVecGen3 op = {                                        \
+        .fni4 = FNI4_FUNC,                                              \
+        .fni8 = FNI8_FUNC,                                              \
+        .vece = VECE                                                    \
+    };                                                                  \
+                                                                        \
+    REQUIRE_VECTOR(ctx);                                                \
+    REQUIRE_INSNS_FLAGS2(ctx, FLAGS);                                   \
+                                                                        \
+    tcg_gen_gvec_3(avr_full_offset(a->vrt), avr_full_offset(a->vra),    \
+                   avr_full_offset(a->vrb), 16, 16, &op);               \
+                                                                        \
+    return true;                                                        \
+}

Better to use a standalone helper and TRANS() -- the op structure doesn't *need* to be static const.

+
+#define DO_VDIV_VMOD(NAME, SZ, DIV, SIGNED)                             \
+static void NAME(TCGv_i##SZ t, TCGv_i##SZ a, TCGv_i##SZ b)              \
+{                                                                       \
+    /*                                                                  \
+     *  If N/0 the instruction used by the backend might deliver        \
+     *  an invalid division signal to the process, so if b = 0 return   \
+     *  N/1 and if signed instruction, the same for a = int_min, b = -1 \
+     */                                                                 \
+    if (SIGNED) {                                                       \
+        TCGv_i##SZ t0 = tcg_temp_new_i##SZ();                           \
+        TCGv_i##SZ t1 = tcg_temp_new_i##SZ();                           \
+        tcg_gen_setcondi_i##SZ(TCG_COND_EQ, t0, a, INT##SZ##_MIN);      \
+        tcg_gen_setcondi_i##SZ(TCG_COND_EQ, t1, b, -1);                 \
+        tcg_gen_and_i##SZ(t0, t0, t1);                                  \
+        tcg_gen_setcondi_i##SZ(TCG_COND_EQ, t1, b, 0);                  \
+        tcg_gen_or_i##SZ(t0, t0, t1);                                   \
+        tcg_gen_movi_i##SZ(t1, 0);                                      \
+        tcg_gen_movcond_i##SZ(TCG_COND_NE, b, t0, t1, t0, b);           \
+        DIV(t, a, b);                                                   \
+        tcg_temp_free_i##SZ(t0);                                        \
+        tcg_temp_free_i##SZ(t1);                                        \
+    } else {                                                            \
+        TCGv_i##SZ zero = tcg_constant_i##SZ(0);                        \
+        TCGv_i##SZ one = tcg_constant_i##SZ(1);                         \
+        tcg_gen_movcond_i##SZ(TCG_COND_EQ, b, b, zero, one, b);         \
+        DIV(t, a, b);                                                   \
+    }                                                                   \
+}

This is overkill. Even if you keep some macros, passing in SIGNED and using it in the outermost if is a sign you should split the macro in two.

However, only tcg_gen_div_i64 really requires the full signed treatment; tcg_gen_div_i32 can be better handled by extending to i64, because INT32_MIN / -1ULL does not trap.
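
To illustrate the widening argument, here is a minimal scalar sketch in plain C, not TCG code. It models a single signed word lane under the assumption that both operands are sign-extended to 64 bits before the divide; the name sdiv32_widened is hypothetical, and substituting 1 for a zero divisor simply follows the comment in the quoted patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Scalar model of one signed 32-bit lane (hypothetical helper, not QEMU API).
 * Assumption: operands are sign-extended to 64 bits before dividing.
 */
static int32_t sdiv32_widened(int32_t a, int32_t b)
{
    int64_t wa = a;   /* sign-extend dividend */
    int64_t wb = b;   /* sign-extend divisor */

    /* After widening, a zero divisor is the only input that can trap. */
    if (wb == 0) {
        wb = 1;       /* as in the patch comment: compute N/1 instead */
    }
    assert(wb != 0);

    /*
     * INT32_MIN / -1 is +2147483648 in 64-bit arithmetic, so the divide
     * itself cannot overflow. Narrowing back to 32 bits is
     * implementation-defined in C; GCC and Clang wrap modulo 2^32.
     */
    return (int32_t)(wa / wb);
}
```

With this shape the signed word case needs one comparison against zero rather than the three setcond operations in the macro above.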


I think this would be much easier to read as 4 separate functions.
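
As a rough picture of what the split-out doubleword functions would compute per lane, the following plain C sketch models the signed and unsigned 64-bit cases separately. It is a scalar model under stated assumptions, not the TCG implementation: the names sdiv64_guarded and udiv64_guarded are hypothetical, and replacing the divisor with 1 in the trapping cases mirrors the comment in the quoted patch.

```c
#include <assert.h>
#include <stdint.h>

/* Signed doubleword lane: both b == 0 and INT64_MIN / -1 must be guarded. */
static int64_t sdiv64_guarded(int64_t a, int64_t b)
{
    if (b == 0 || (a == INT64_MIN && b == -1)) {
        b = 1;        /* avoid the host trap; the result becomes a / 1 */
    }
    assert(b != 0);
    return a / b;
}

/* Unsigned doubleword lane: only a zero divisor needs guarding. */
static uint64_t udiv64_guarded(uint64_t a, uint64_t b)
{
    if (b == 0) {
        b = 1;
    }
    assert(b != 0);
    return a / b;
}
```

Written this way, each function carries only the checks its own case needs, with no SIGNED parameter selecting between two unrelated bodies.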


r~
