Re: [PATCH v5 57/60] target/riscv: vector slide instructions

From: Richard Henderson
Subject: Re: [PATCH v5 57/60] target/riscv: vector slide instructions
Date: Mon, 16 Mar 2020 10:42:56 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.4.1

On 3/16/20 1:04 AM, LIU Zhiwei wrote:
>> As a preference, I think you can do away with this helper.
>> Simply use the slideup helper with argument 1, and then
>> afterwards store the integer register into element 0.  You should be able to
>> re-use code from vmv.s.x for that.
> When I try it, I find it somewhat difficult, because vmv.s.x will clear
> the elements (0 < index < VLEN/SEW).

Well, two things about that:

(1) The 0.8 version of vmv.s.x does *not* zero the other elements, so we'll
want to be prepared for that.

(2) We have 8 insns that, in the end, come down to a direct element access,
possibly with some other processing.

So we'll want basic helper functions that can locate an element by immediate
offset and by variable offset:

/* Compute the offset of vreg[idx] relative to cpu_env.
   The index must be in range of VLMAX.  */
int vec_element_ofsi(DisasContext *s, int vreg, int idx, int sew);

/* Compute a pointer to vreg[idx].
   If need_bound is true, mask idx into the range of VLMAX;
   otherwise we know a priori that idx is already in bounds.  */
void vec_element_ofsx(DisasContext *s, TCGv_ptr base,
                      TCGv idx, int sew, bool need_bound);

/* Load idx >= VLMAX ? 0 : vreg[idx] */
void vec_element_loadi(DisasContext *s, TCGv_i64 val,
                       int vreg, int idx, int sew);
void vec_element_loadx(DisasContext *s, TCGv_i64 val,
                       int vreg, TCGv idx, int sew);

/* Store vreg[imm] = val.
   The index must be in range of VLMAX.  */
void vec_element_storei(DisasContext *s, int vreg, int imm,
                        TCGv_i64 val);
void vec_element_storex(DisasContext *s, int vreg,
                        TCGv idx, TCGv_i64 val);
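For illustration, the offset and bounded-load semantics these prototypes
describe can be modeled in plain C (not TCG).  In this sketch a flat byte
array stands in for cpu_env, a little-endian host is assumed, the
DisasContext argument is dropped, and VLENB and vec_element_load are
invented names:

```c
#include <stdint.h>
#include <string.h>

#define VLENB 16                      /* assume VLEN = 128 bits */

/* Byte offset of vreg[idx] in the flat array; sew = log2(element bytes).
 * On a big-endian host the index would need xor-adjusting within each
 * 8-byte unit; that is omitted here. */
static int vec_element_ofsi(int vreg, int idx, int sew)
{
    return vreg * VLENB + (idx << sew);
}

/* Model of vec_element_loadi/loadx: idx >= VLMAX ? 0 : vreg[idx]. */
static uint64_t vec_element_load(const uint8_t *venv, int vreg,
                                 uint64_t idx, int sew)
{
    uint64_t vlmax = VLENB >> sew;
    uint64_t val = 0;

    if (idx >= vlmax) {
        return 0;
    }
    memcpy(&val, venv + vec_element_ofsi(vreg, idx, sew), 1 << sew);
    return val;
}
```

The out-of-range-reads-as-zero behavior is what lets the loadx variant be
used directly for the element-extract and vrgather cases below.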

(3) It would be handy to have TCGv cpu_vl.


vext.x.v:
    If rs1 == 0,
        Use vec_element_loadi(s, x[rd], vs2, 0, s->sew).
    Else
        Use vec_element_loadx(s, x[rd], vs2, x[rs1], s->sew).

vmv.s.x:
    over = gen_new_label();
    tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over);
    For 0.7.1:
        Use tcg_gen_dup8i to zero all VLMAX elements of vd.
        If rs1 == 0, goto over.
    Use vec_element_storei(s, vd, 0, x[rs1]).
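The 0.7.1 sequence for this insn can be modeled in plain C (not TCG) as
follows; the function name is invented for the sketch:

```c
#include <stdint.h>

/* Minimal model of the vmv.s.x sequence above: do nothing when vl == 0,
 * zero all VLMAX elements (0.7.1 semantics; 0.8 would drop this step),
 * then write the scalar to element 0. */
static void vmv_s_x(uint64_t *vd, uint64_t rs1, int vl, int vlmax)
{
    if (vl == 0) {
        return;                       /* brcondi on cpu_vl */
    }
    for (int i = 0; i < vlmax; i++) { /* tcg_gen_dup8i zeroing */
        vd[i] = 0;
    }
    vd[0] = rs1;                      /* vec_element_storei */
}
```

Note that with rs1 == x0 the store writes zero into an already-zeroed
element, which is why the sketch can skip it.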

vfmv.f.s:
    Use vec_element_loadi(s, f[rd], vs2, 0, s->sew).
    NaN-box f[rd] as necessary for SEW.
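NaN-boxing here means setting every bit above the SEW-wide value to 1, as
required when a narrower FP value lives in a wider FP register.  A minimal
sketch (the helper name is invented):

```c
#include <stdint.h>

/* Fill all bits above the (8 << sew)-bit value with ones; a full-width
 * value is returned unchanged (and avoids an undefined 64-bit shift). */
static uint64_t nanbox(uint64_t val, int sew)
{
    int bits = 8 << sew;

    if (bits >= 64) {
        return val;
    }
    return val | (UINT64_MAX << bits);
}
```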

vfmv.s.f:
    tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over);
    For 0.7.1:
        Use tcg_gen_dup8i to zero all VLMAX elements of vd.
    Let tmp = f[rs1], NaN-boxed as necessary for SEW.
    Use vec_element_storei(s, vd, 0, tmp).

vslide1up:
    Ho hum, I forgot about masking.  Some options:
    (1) Call a helper, just as you did in your original patch.
    (2) Call a helper only for !vm; for vm, as below.
    (3) Call vslideup with offset 1, then:
        tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over);
        If !vm,
            // inline test for v0[0]
            vec_element_loadi(s, tmp, 0, 0, MO_8);
            tcg_gen_andi_i64(tmp, tmp, 1);
            tcg_gen_brcondi_i64(TCG_COND_EQ, tmp, 0, over);
        Use vec_element_storei(s, vd, 0, x[rs1]).
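Option (3) can be modeled in plain C (not TCG) roughly as below.  The
function name is invented, elements are simplified to uint64_t, and masking
of the slid elements themselves is left out (the real vslideup step handles
that); only the element-0 mask test from the sketch is shown:

```c
#include <stdint.h>

/* Model of vslide1up option (3): slide up by one, then overwrite element
 * 0 with the scalar, skipped when vl == 0 or v0[0] masks it off. */
static void vslide1up(uint64_t *vd, const uint64_t *vs2, uint64_t rs1,
                      int vl, const uint8_t *v0, int vm)
{
    if (vl == 0) {
        return;
    }
    for (int i = vl - 1; i > 0; i--) {  /* vslideup with offset 1 */
        vd[i] = vs2[i - 1];
    }
    if (vm || (v0[0] & 1)) {            /* inline test for v0[0] */
        vd[0] = rs1;
    }
}
```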

vslide1down:
    For !vm, this is complicated enough for a helper.
    If using option 3 as for vslide1up, then the store becomes:
        tcg_gen_subi_tl(tmp, cpu_vl, 1);
        vec_element_storex(s, vd, tmp, x[rs1]);
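The unmasked decomposition can be modeled the same way (plain C, invented
name, uint64_t elements): slide down by one, then store the scalar at
element vl-1, which is what the storex with tmp = vl - 1 does:

```c
#include <stdint.h>

/* Model of unmasked vslide1down: vd[i] = vs2[i+1] for i < vl-1, then the
 * scalar lands in the last active element. */
static void vslide1down(uint64_t *vd, const uint64_t *vs2, uint64_t rs1,
                        int vl)
{
    if (vl == 0) {
        return;
    }
    for (int i = 0; i < vl - 1; i++) {  /* vslidedown with offset 1 */
        vd[i] = vs2[i + 1];
    }
    vd[vl - 1] = rs1;                   /* vec_element_storex */
}
```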

vrgather.vx:
    If !vm or !vl_eq_vlmax, use helper.  Otherwise:
        vec_element_loadx(s, tmp, vs2, x[rs1], s->sew);
        Use tcg_gen_gvec_dup_i64 to store tmp to vd.
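This fast path is just a splat: load one element (out-of-range reads as
zero, per the loadx semantics above) and broadcast it.  A plain-C model
with an invented name:

```c
#include <stdint.h>

/* Model of the vm && vl_eq_vlmax vrgather.vx path: vd[i] = tmp for all
 * i, where tmp is vs2[x[rs1]] or zero when the index is out of range. */
static void vrgather_vx(uint64_t *vd, const uint64_t *vs2, uint64_t rs1,
                        int vlmax)
{
    uint64_t tmp = rs1 < (uint64_t)vlmax ? vs2[rs1] : 0;

    for (int i = 0; i < vlmax; i++) {   /* tcg_gen_gvec_dup_i64 */
        vd[i] = tmp;
    }
}
```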

vrgather.vi:
    If !vm or !vl_eq_vlmax, use helper.  Otherwise:
        If imm >= vlmax,
            Use tcg_gen_dup8i to zero vd.
        Else,
            ofs = vec_element_ofsi(s, vs2, imm, s->sew);
            tcg_gen_gvec_dup_mem(sew, vreg_ofs(vd),
                                 ofs, vlmax, vlmax);

