qemu-devel

Re: [PATCH v3 16/16] tcg/loongarch64: Implement 128-bit load & store


From: gaosong
Subject: Re: [PATCH v3 16/16] tcg/loongarch64: Implement 128-bit load & store
Date: Mon, 4 Sep 2023 09:43:56 +0800
User-agent: Mozilla/5.0 (X11; Linux loongarch64; rv:68.0) Gecko/20100101 Thunderbird/68.7.0

Hi, yijun

On 2023/9/3 9:10 AM, Jiajie Chen wrote:

On 2023/9/3 09:06, Richard Henderson wrote:
On 9/1/23 22:02, Jiajie Chen wrote:
If LSX is available, use LSX instructions to implement 128-bit load &
store.

Is this really guaranteed to be an atomic 128-bit operation?


Song Gao, please check this.


Could you explain this issue?  Thanks.

Or, as for many vector processors, is this really two separate 64-bit memory operations under the hood?
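To illustrate why the question matters, here is a minimal sketch of what an observer could see if a 128-bit store were carried out as two 64-bit stores. It is a deterministic model, not real concurrent code and not a statement about LSX behaviour; `U128`, `split_store` and `is_torn` are hypothetical names made up for this example.

```c
#include <stdint.h>

/* A 128-bit value held as two 64-bit halves (cf. data_lo/data_hi in the patch). */
typedef struct {
    uint64_t lo;
    uint64_t hi;
} U128;

/*
 * Hypothetical model of a 128-bit store that is split into two 64-bit
 * stores. `steps` says how many halves have reached memory so far:
 * 0 = none, 1 = low half only, 2 = both halves.
 */
static void split_store(U128 *mem, U128 val, int steps)
{
    if (steps >= 1) {
        mem->lo = val.lo;
    }
    if (steps >= 2) {
        mem->hi = val.hi;
    }
}

/* Returns 1 if `seen` is neither the old nor the new value (a torn read). */
static int is_torn(U128 seen, U128 old_val, U128 new_val)
{
    int is_old = seen.lo == old_val.lo && seen.hi == old_val.hi;
    int is_new = seen.lo == new_val.lo && seen.hi == new_val.hi;
    return !is_old && !is_new;
}
```

A reader that samples memory after only the first half has landed sees a mix of old and new data, which a truly atomic 128-bit store would never expose.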


+static void tcg_out_qemu_ldst_i128(TCGContext *s, TCGReg data_lo, TCGReg data_hi,
+                                   TCGReg addr_reg, MemOpIdx oi, bool is_ld)
+{
+    TCGLabelQemuLdst *ldst;
+    HostAddress h;
+
+    ldst = prepare_host_addr(s, &h, addr_reg, oi, true);
+    if (is_ld) {
+        tcg_out_opc_vldx(s, TCG_VEC_TMP0, h.base, h.index);
+        tcg_out_opc_vpickve2gr_d(s, data_lo, TCG_VEC_TMP0, 0);
+        tcg_out_opc_vpickve2gr_d(s, data_hi, TCG_VEC_TMP0, 1);
+    } else {
+        tcg_out_opc_vinsgr2vr_d(s, TCG_VEC_TMP0, data_lo, 0);
+        tcg_out_opc_vinsgr2vr_d(s, TCG_VEC_TMP0, data_hi, 1);
+        tcg_out_opc_vstx(s, TCG_VEC_TMP0, h.base, h.index);
+    }

You should check h.aa.atom < MO_128 to determine whether 128-bit atomicity, and therefore the vector operation, is required.  I assume the gr<->vr moves have a cost, so two integer operations are preferred when allowable.

Compare the other implementations of this function.
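The suggested check can be sketched in isolation as follows. This is a standalone model, not the actual backend code: the enum only mirrors the ordering of QEMU's MemOp size constants for the purpose of the example, and `I128Strategy` and `pick_i128_strategy` are hypothetical names.

```c
/*
 * Size constants in increasing order, mirroring the ordering of QEMU's
 * MemOp sizes. The values here are defined for this sketch only.
 */
enum { MO_8, MO_16, MO_32, MO_64, MO_128 };

/* Hypothetical result type: which code path the backend would emit. */
typedef enum {
    USE_TWO_INT_OPS,   /* two 64-bit integer loads/stores */
    USE_VECTOR_OP      /* one 128-bit vector load/store plus gr<->vr moves */
} I128Strategy;

/*
 * `atom` stands in for h.aa.atom: the required atomicity of the access.
 * Anything below MO_128 does not need a single 128-bit access, so the
 * cheaper pair of integer operations is allowed.
 */
static I128Strategy pick_i128_strategy(int atom)
{
    if (atom < MO_128) {
        return USE_TWO_INT_OPS;
    }
    return USE_VECTOR_OP;
}
```

With this shape, an access that only needs 64-bit atomicity avoids the vector register round trip, while an access that requires full 128-bit atomicity still goes through the vector path, assuming that path is in fact atomic on the host.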


r~



