## Re: [PATCH v4 24/47] target/ppc: move vrl[bhwd]nm/vrl[bhwd]mi to decodetree

From: Matheus K. Ferst
Subject: Re: [PATCH v4 24/47] target/ppc: move vrl[bhwd]nm/vrl[bhwd]mi to decodetree
Date: Wed, 23 Feb 2022 18:43:35 -0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.5.0

On 22/02/2022 19:30, Richard Henderson wrote:
On 2/22/22 04:36, matheus.ferst@eldorado.org.br wrote:
+static void gen_vrlnm_vec(unsigned vece, TCGv_vec vrt, TCGv_vec vra,
+                          TCGv_vec vrb)
+{
+    TCGv_vec mask, n = tcg_temp_new_vec_matching(vrt);
+    /* Create the mask */
+    /* Extract n */
+    tcg_gen_dupi_vec(vece, n, (8 << vece) - 1);
+    tcg_gen_and_vec(vece, n, vrb, n);
+    /* Rotate and mask */
+    tcg_gen_rotlv_vec(vece, vrt, vra, n);
Note that rotlv does the masking itself:

/*
 * Expand D = A << (B % element bits)
 *
 * Unlike scalar shifts, where it is easy for the target front end
 * to include the modulo as part of the expansion.  If the target
 * naturally includes the modulo as part of the operation, great!
 * If the target has some other behaviour from out-of-range shifts,
 * then it could not use this function anyway, and would need to
 * do it's own expansion with custom functions.
 */

Passing vrb directly, i.e. tcg_gen_rotlv_vec(vece, vrt, vra, vrb), works on PPC but fails on x86. This looks like a problem in the i386 backend: it uses VPS[RL]LV[DQ], but rather than taking the count modulo the element size, these instructions write zero to the element when the count is out of range [1]. I'm not sure how to fix that. Do we need an INDEX_op_shlv_vec case in the i386 tcg_expand_vec_op?
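
To make the mismatch concrete, here is a minimal scalar sketch for one 32-bit element. It is not QEMU code: every function name below is hypothetical, and the two-shift expansion is only an assumed way a backend might build a rotate out of variable shifts. It contrasts a rotate whose count is taken modulo the element width with one built from shifts that return zero for out-of-range counts, which is the behaviour described for VPS[RL]LV[DQ].

```c
#include <assert.h>
#include <stdint.h>

/* Rotate left with the count reduced modulo 32, i.e. the semantics
 * the quoted comment describes for rotlv. */
static uint32_t rotl32_modulo(uint32_t a, uint32_t b)
{
    unsigned n = b & 31;
    return n ? (a << n) | (a >> (32 - n)) : a;
}

/* Hypothetical models of variable shifts that yield zero when the
 * count is >= the element width instead of wrapping the count. */
static uint32_t shl32_zeroing(uint32_t a, uint32_t count)
{
    return count < 32 ? a << count : 0;
}

static uint32_t shr32_zeroing(uint32_t a, uint32_t count)
{
    return count < 32 ? a >> count : 0;
}

/* Assumed expansion: rotate as two zeroing shifts, count used as-is.
 * For b > 32 both shift counts are out of range, so the result is 0. */
static uint32_t rotl32_unmasked(uint32_t a, uint32_t b)
{
    return shl32_zeroing(a, b) | shr32_zeroing(a, 32 - b);
}

/* Same expansion, but the count is reduced first, which restores the
 * modulo behaviour on top of the zeroing shifts. */
static uint32_t rotl32_masked(uint32_t a, uint32_t b)
{
    uint32_t n = b & 31;
    return shl32_zeroing(a, n) | shr32_zeroing(a, 32 - n);
}
```

With a = 0x80000001 and b = 36, rotl32_modulo and rotl32_masked both rotate by 4, while rotl32_unmasked returns 0 because both shift counts are out of range. For counts below 32 all three agree, so the difference only shows up when the upper bits of the count are set.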
+static bool do_vrlnm(DisasContext *ctx, arg_VX *a, int vece)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_cmp_vec, INDEX_op_rotlv_vec, INDEX_op_sari_vec,
+        INDEX_op_shli_vec, INDEX_op_shri_vec, INDEX_op_shrv_vec, 0
+    };
Where is sari used?

I'll remove it in v5.

[1] Section 5.3 of https://www.intel.com/content/dam/develop/external/us/en/documents/36945
