From: Richard Henderson
Subject: [PATCH 28/61] target/arm: Implement SME2 ADD/SUB (array results, multiple and single vector)
Date: Thu, 6 Feb 2025 11:56:42 -0800
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
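Note: as the loop in do_azz_n1 reads, each of the n result vectors is
computed as Zn[(zn + i) % 32] plus or minus the single vector Zm, with
Zm reused for every member of the group. The following is only a
scalar reference sketch of that behaviour, not code from the patch.
The names Vec32, ELEMS and model_azz_n1 are hypothetical, the vector
length is fixed at an assumed 128 bits, and the strided placement of
the group inside ZA (the o_za computation) is deliberately ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: 4 x 32-bit lanes, i.e. an assumed 128-bit vector. */
#define ELEMS 4

/* Hypothetical stand-in for one vector of 32-bit elements. */
typedef struct {
    uint32_t e[ELEMS];
} Vec32;

/*
 * Reference model of the _s forms:
 *   za[i] = z[(zn + i) % 32] (+ or -) z[zm],  for i in [0, n)
 * The source register number wraps modulo 32, as in the patch.
 */
static void model_azz_n1(Vec32 *za, const Vec32 z[32],
                         unsigned zn, unsigned zm, unsigned n, int is_sub)
{
    assert(n == 2 || n == 4);   /* only two- and four-vector groups */

    for (unsigned i = 0; i < n; i++) {
        const Vec32 *vn = &z[(zn + i) % 32];

        for (size_t j = 0; j < ELEMS; j++) {
            uint32_t a = vn->e[j];
            uint32_t b = z[zm].e[j];

            /* Unsigned arithmetic wraps modulo 2^32. */
            za[i].e[j] = is_sub ? a - b : a + b;
        }
    }
}
```

With zn = 31 and n = 2 the model reads Z31 and then Z0, which is the
wrap-around case the "% 32" in the translator handles.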
target/arm/tcg/translate.h | 2 ++
target/arm/tcg/translate-sme.c | 29 +++++++++++++++++++++++++++++
target/arm/tcg/sme.decode | 15 +++++++++++++++
3 files changed, 46 insertions(+)
diff --git a/target/arm/tcg/translate.h b/target/arm/tcg/translate.h
index c364d977f3..be39adfa86 100644
--- a/target/arm/tcg/translate.h
+++ b/target/arm/tcg/translate.h
@@ -638,6 +638,8 @@ typedef void GVecGen3Fn(unsigned, uint32_t, uint32_t,
uint32_t, uint32_t, uint32_t);
typedef void GVecGen4Fn(unsigned, uint32_t, uint32_t, uint32_t,
uint32_t, uint32_t, uint32_t);
+typedef void GVecGen3FnVar(unsigned, TCGv_ptr, uint32_t, TCGv_ptr, uint32_t,
+                           TCGv_ptr, uint32_t, uint32_t, uint32_t);
/* Function prototype for gen_ functions for calling Neon helpers */
typedef void NeonGenOneOpFn(TCGv_i32, TCGv_i32);
diff --git a/target/arm/tcg/translate-sme.c b/target/arm/tcg/translate-sme.c
index 617621d663..09a4da1725 100644
--- a/target/arm/tcg/translate-sme.c
+++ b/target/arm/tcg/translate-sme.c
@@ -691,3 +691,32 @@ static gen_helper_gvec_3_ptr * const f_vector_fminnm[4] = {
};
TRANS_FEAT(FMINNM_n1, aa64_sme2, do_z2z_n1_fpst, a, f_vector_fminnm)
TRANS_FEAT(FMINNM_nn, aa64_sme2, do_z2z_nn_fpst, a, f_vector_fminnm)
+
+static bool do_azz_n1(DisasContext *s, arg_azz_n *a, int esz,
+                      GVecGen3FnVar *fn)
+{
+    TCGv_ptr t_za;
+    int svl, n, o_zm;
+
+    if (!sme_smza_enabled_check(s)) {
+        return true;
+    }
+
+    n = a->n;
+    t_za = get_zarray(s, a->rv, a->off, n);
+    o_zm = vec_full_reg_offset(s, a->zm);
+    svl = streaming_vec_reg_size(s);
+
+    for (int i = 0; i < n; ++i) {
+        int o_za = (svl / n * sizeof(ARMVectorReg)) * i;
+        int o_zn = vec_full_reg_offset(s, (a->zn + i) % 32);
+
+        fn(esz, t_za, o_za, tcg_env, o_zn, tcg_env, o_zm, svl, svl);
+    }
+    return true;
+}
+
+TRANS_FEAT(ADD_azz_n1_s, aa64_sme2, do_azz_n1, a, MO_32, tcg_gen_gvec_add_var)
+TRANS_FEAT(SUB_azz_n1_s, aa64_sme2, do_azz_n1, a, MO_32, tcg_gen_gvec_sub_var)
+TRANS_FEAT(ADD_azz_n1_d, aa64_sme2_i16i64, do_azz_n1, a, MO_64, tcg_gen_gvec_add_var)
+TRANS_FEAT(SUB_azz_n1_d, aa64_sme2_i16i64, do_azz_n1, a, MO_64, tcg_gen_gvec_sub_var)
diff --git a/target/arm/tcg/sme.decode b/target/arm/tcg/sme.decode
index 470592f4c0..8b81c0a0ce 100644
--- a/target/arm/tcg/sme.decode
+++ b/target/arm/tcg/sme.decode
@@ -245,3 +245,18 @@ URSHL_nn 1100000 1 .. 1 ..... 1011.0 10001 .... 1 @z2z_4x4
SQDMULH_nn 1100000 1 .. 1 ..... 1011.1 00000 .... 0 @z2z_2x2
SQDMULH_nn 1100000 1 .. 1 ..... 1011.1 00000 .... 0 @z2z_4x4
+
+### SME2 Multi-vector Multiple and Single Array Vectors
+
+&azz_n n off rv zn zm
+@azz_nx1_o3 ........ .... zm:4 ...... zn:5 .. off:3 &azz_n rv=%mova_rv
+
+ADD_azz_n1_s 11000001 0010 .... 0 .. 110 ..... 10 ... @azz_nx1_o3 n=2
+ADD_azz_n1_s 11000001 0011 .... 0 .. 110 ..... 10 ... @azz_nx1_o3 n=4
+ADD_azz_n1_d 11000001 0110 .... 0 .. 110 ..... 10 ... @azz_nx1_o3 n=2
+ADD_azz_n1_d 11000001 0111 .... 0 .. 110 ..... 10 ... @azz_nx1_o3 n=4
+
+SUB_azz_n1_s 11000001 0010 .... 0 .. 110 ..... 11 ... @azz_nx1_o3 n=2
+SUB_azz_n1_s 11000001 0011 .... 0 .. 110 ..... 11 ... @azz_nx1_o3 n=4
+SUB_azz_n1_d 11000001 0110 .... 0 .. 110 ..... 11 ... @azz_nx1_o3 n=2
+SUB_azz_n1_d 11000001 0111 .... 0 .. 110 ..... 11 ... @azz_nx1_o3 n=4
--
2.43.0