From: Richard Henderson
Subject: Re: [Qemu-arm] [Qemu-devel] [RFC PATCH 9/9] target/arm/translate-a64: vectorise smull vD.4s, vN.[48]s, vM.h[]
Date: Thu, 17 Aug 2017 13:23:49 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.2.1

On 08/17/2017 11:04 AM, Alex Bennée wrote:
> +    int32_t *rd = (int32_t *) d;
> +    int16_t *rn = (int16_t *) n;
> +    int16_t rm = (int16_t) m;
> +    int i;
> +
> +    #pragma GCC ivdep
> +    for (i = 0; i < opr_elt; ++i) {
> +        rd[i] = rn[i + doff_elt] * rm;
> +    }

You need to run this loop backward to avoid clobbering data when rd == rn.
I thought you'd put m into ADVSIMD_DATA.
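
To illustrate the overlap hazard when the destination aliases the source, here is a minimal sketch, not the patch's helper. It assumes a zero element offset (doff_elt == 0); the vec128 union and both function names are hypothetical and exist only for this example. Each 32-bit result occupies the storage of two 16-bit inputs, so a forward pass destroys inputs it has not read yet.

```c
#include <assert.h>
#include <stdint.h>

/* One 128-bit register viewed either as 8 x int16 (source lanes) or as
 * 4 x int32 (result lanes).  Hypothetical type for this sketch; a union
 * keeps the overlapping accesses well defined in standard C. */
typedef union {
    int16_t h[8];
    int32_t s[4];
} vec128;

/* Forward in-place widening multiply.  Writing s[i] overwrites h[2*i]
 * and h[2*i + 1], so h[1] is already gone when iteration 1 reads it. */
void smull_inplace_forward(vec128 *v, int16_t m, int n)
{
    assert(n >= 0 && n <= 4);
    for (int i = 0; i < n; ++i) {
        v->s[i] = (int32_t)v->h[i] * m;
    }
}

/* Backward in-place widening multiply.  Writing s[i] only touches
 * h[2*i] and h[2*i + 1]; every element still to be read has a lower
 * index than i, so it is intact when its turn comes. */
void smull_inplace_backward(vec128 *v, int16_t m, int n)
{
    assert(n >= 0 && n <= 4);
    for (int i = n - 1; i >= 0; --i) {
        v->s[i] = (int32_t)v->h[i] * m;
    }
}
```

With inputs {1, 2, 3, 4} and a multiplier of 3, the backward version yields {3, 6, 9, 12}, while the forward version corrupts every lane after the first. This sketch covers only a zero offset: with a nonzero doff_elt the overlap pattern is different, and the safe direction would need to be rechecked for that case.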
>
> +    if (is_q) {
> +        simd_info = deposit32(simd_info,
> +                              ADVSIMD_DOFF_ELT_SHIFT,
> +                              ADVSIMD_DOFF_ELT_BITS, 4);
> +    }

It'd probably be useful to have a macro to clean this up:

#define PUT_SIMD_DATA(t, d) \
    deposit32(0, ADVSIMD_ ## t ## _SHIFT, ADVSIMD_ ## t ## _BITS, (d))

    simd_info |= PUT_SIMD_DATA(DOFF_ELT, 4);

That said, folding DOFF into the pointer that gets passed in the first place
seems a better solution to me.

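
As a rough, standalone sketch of how the macro might be used: the deposit32() below is a local stand-in with the same argument order as QEMU's helper, and the ADVSIMD_DOFF_ELT_SHIFT and ADVSIMD_DOFF_ELT_BITS values are made up for illustration (the real ones are defined elsewhere in the series). make_simd_info() is a hypothetical wrapper, not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in for QEMU's deposit32(): replace 'length' bits of
 * 'value', starting at bit 'start', with the low bits of 'fieldval'. */
static inline uint32_t deposit32(uint32_t value, int start, int length,
                                 uint32_t fieldval)
{
    assert(start >= 0 && length > 0 && length <= 32 - start);
    uint32_t mask = (~0U >> (32 - length)) << start;
    return (value & ~mask) | ((fieldval << start) & mask);
}

/* Hypothetical field layout, chosen only for this example. */
#define ADVSIMD_DOFF_ELT_SHIFT 8
#define ADVSIMD_DOFF_ELT_BITS  4

/* Build a word holding only field 't' set to 'd', ready to be OR-ed in. */
#define PUT_SIMD_DATA(t, d) \
    deposit32(0, ADVSIMD_ ## t ## _SHIFT, ADVSIMD_ ## t ## _BITS, (d))

/* Hypothetical wrapper: set the element offset only for the Q form. */
uint32_t make_simd_info(uint32_t base, int is_q)
{
    uint32_t simd_info = base;
    if (is_q) {
        simd_info |= PUT_SIMD_DATA(DOFF_ELT, 4);
    }
    return simd_info;
}
```

One caveat with the OR-based form: it matches the original deposit32(simd_info, ...) call only while the target field in simd_info is still zero, since OR cannot clear bits that are already set.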
r~