Re: [PATCH 27/36] target/arm: Convert Neon VABA 3-reg-same to decodetree
From: Richard Henderson
Subject: Re: [PATCH 27/36] target/arm: Convert Neon VABA 3-reg-same to decodetree
Date: Thu, 30 Apr 2020 19:29:43 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.7.0

On 4/30/20 11:09 AM, Peter Maydell wrote:
> +    for (pass = 0; pass < (a->q ? 4 : 2); pass++) {
> +        tmp = neon_load_reg(a->vn, pass);
> +        tmp2 = neon_load_reg(a->vm, pass);
> +        abd_fn(tmp, tmp, tmp2);
> +        tcg_temp_free_i32(tmp2);
> +        tmp2 = neon_load_reg(a->vd, pass);
> +        add_fn(tmp, tmp, tmp2);
> +        tcg_temp_free_i32(tmp2);
> +        neon_store_reg(a->vd, pass, tmp);
> +    }
> +    return true;
> +}
> +
> +static bool trans_VABA_S_3s(DisasContext *s, arg_3same *a)
> +{
> +    static NeonGenTwoOpFn * const abd_fns[] = {
> +        gen_helper_neon_abd_s8,
> +        gen_helper_neon_abd_s16,
> +        gen_helper_neon_abd_s32,
> +    };
> +    static NeonGenTwoOpFn * const add_fns[] = {
> +        gen_helper_neon_add_u8,
> +        gen_helper_neon_add_u16,
> +        tcg_gen_add_i32,
> +    };

This can be packaged into one operation. E.g.

    static void gen_aba_s8(TCGv_i32 d, TCGv_i32 n, TCGv_i32 m)
    {
        TCGv_i32 t = tcg_temp_new_i32();

        gen_helper_neon_abd_s8(t, n, m);   /* t = |n - m| per 8-bit lane */
        gen_helper_neon_add_u8(d, d, t);   /* d += t per 8-bit lane */
        tcg_temp_free_i32(t);
    }

    static const GVecGen3 op = {
        .fni4 = gen_aba_s8,
        .load_dest = true,
    };

etc.

FWIW, this is one that I've fully converted on my sve2 branch:
aba(n,m,a) = max(n,m) - min(n,m) + a, which is four fully vectorized
operations. So anything that allows a drop-in replacement would be nice.
But whatever is easiest for you.
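
To illustrate why the max/min form is equivalent, here is a minimal scalar sketch in plain C for a single signed 8-bit lane. It is not QEMU code: the function names (aba_s8_ref, aba_s8_minmax, aba_s8_forms_agree) are hypothetical, and it assumes the accumulate wraps modulo 256, as a per-lane 8-bit add would.

```c
#include <stdint.h>

/* Reference form: a + |n - m| for one signed 8-bit lane, wrapping mod 256.
 * (Hypothetical helper, for illustration only.) */
static uint8_t aba_s8_ref(int8_t n, int8_t m, uint8_t a)
{
    int diff = (int)n - (int)m;      /* computed in int, so no overflow */
    if (diff < 0) {
        diff = -diff;                /* absolute difference, 0..255 */
    }
    return (uint8_t)(a + (unsigned)diff);
}

/* max/min form: max(n,m) - min(n,m) + a, again wrapping mod 256.
 * Four lane-wise operations: max, min, sub, add. */
static uint8_t aba_s8_minmax(int8_t n, int8_t m, uint8_t a)
{
    int8_t hi = n > m ? n : m;       /* signed max */
    int8_t lo = n > m ? m : n;       /* signed min */
    return (uint8_t)((uint8_t)hi - (uint8_t)lo + a);
}

/* Exhaustively compare both forms; returns 1 if they agree everywhere. */
static int aba_s8_forms_agree(void)
{
    for (int n = -128; n <= 127; n++) {
        for (int m = -128; m <= 127; m++) {
            for (int a = 0; a < 256; a++) {
                if (aba_s8_ref((int8_t)n, (int8_t)m, (uint8_t)a) !=
                    aba_s8_minmax((int8_t)n, (int8_t)m, (uint8_t)a)) {
                    return 0;
                }
            }
        }
    }
    return 1;
}
```

The point of the max/min form is that none of its steps needs a per-lane branch or a widened intermediate, which is what makes each step expressible as a single vector operation.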
r~