[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH v3] target/arm: Fix sve2 ldnt1 and stnt1
From: |
Peter Maydell |
Subject: |
Re: [PATCH v3] target/arm: Fix sve2 ldnt1 and stnt1 |
Date: |
Tue, 8 Mar 2022 11:56:59 +0000 |
On Tue, 8 Mar 2022 at 03:18, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> For both ldnt1 and stnt1, the meaning of the Rn and Rm are different
> from ld1 and st1: the vector and integer registers are reversed, and
> the integer register 31 refers to XZR instead of SP.
>
> Secondly, the 64-bit version of ldnt1 was being interpreted as
> 32-bit unpacked unscaled offset instead of 64-bit unscaled offset,
> which discarded the upper 32 bits of the address coming from
> the vector argument.
>
> Thirdly, validate that the memory element size is in range for the
> vector element size for ldnt1. For ld1, we do this via independent
> decode patterns, but for ldnt1 we need to do it manually.
>
> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/826
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> target/arm/sve.decode | 5 ++-
> target/arm/translate-sve.c | 51 +++++++++++++++++++++++++++++--
> tests/tcg/aarch64/test-826.c | 50 ++++++++++++++++++++++++++++++
> tests/tcg/aarch64/Makefile.target | 4 +++
> tests/tcg/configure.sh | 4 +++
> 5 files changed, 109 insertions(+), 5 deletions(-)
> create mode 100644 tests/tcg/aarch64/test-826.c
>
> diff --git a/target/arm/sve.decode b/target/arm/sve.decode
> index c60b9f0fec..0388cce3bd 100644
> --- a/target/arm/sve.decode
> +++ b/target/arm/sve.decode
> @@ -1575,10 +1575,9 @@ USDOT_zzzz 01000100 .. 0 ..... 011 110 .....
> ..... @rda_rn_rm
>
> ### SVE2 Memory Gather Load Group
>
> -# SVE2 64-bit gather non-temporal load
> -# (scalar plus unpacked 32-bit unscaled offsets)
> +# SVE2 64-bit gather non-temporal load (scalar plus 64-bit unscaled offsets)
> LDNT1_zprz 1100010 msz:2 00 rm:5 1 u:1 0 pg:3 rn:5 rd:5 \
> - &rprr_gather_load xs=0 esz=3 scale=0 ff=0
> + &rprr_gather_load xs=2 esz=3 scale=0 ff=0
We correct the xs= value here...
>
> # SVE2 32-bit gather non-temporal load (scalar plus 32-bit unscaled offsets)
> LDNT1_zprz 1000010 msz:2 00 rm:5 10 u:1 pg:3 rn:5 rd:5 \
> diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
> index 33ca1bcfac..2c23459e76 100644
> --- a/target/arm/translate-sve.c
> +++ b/target/arm/translate-sve.c
> @@ -6487,10 +6487,33 @@ static bool trans_LD1_zpiz(DisasContext *s,
> arg_LD1_zpiz *a)
>
> static bool trans_LDNT1_zprz(DisasContext *s, arg_LD1_zprz *a)
> {
> + gen_helper_gvec_mem_scatter *fn = NULL;
> + bool be = s->be_data == MO_BE;
> + bool mte = s->mte_active[0];
> +
> + if (a->esz < a->msz + !a->u) {
> + return false;
> + }
> if (!dc_isar_feature(aa64_sve2, s)) {
> return false;
> }
> - return trans_LD1_zprz(s, a);
> + if (!sve_access_check(s)) {
> + return true;
> + }
> +
> + switch (a->esz) {
> + case MO_32:
> + fn = gather_load_fn32[mte][be][0][0][a->u][a->msz];
> + break;
> + case MO_64:
> + fn = gather_load_fn64[mte][be][0][2][a->u][a->msz];
> + break;
> + }
...but then instead of using it we hard code use of 0 or 2 here,
which makes the change above a bit moot.
> + assert(fn != NULL);
> +
> + do_mem_zpz(s, a->rd, a->pg, a->rn, 0,
> + cpu_reg(s, a->rm), a->msz, false, fn);
> + return true;
> }
Anyway
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
thanks
-- PMM