[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [RFC v1 09/13] target-ppc: add cmpeqb instruction
From: |
Nikunj A Dadhania |
Subject: |
Re: [Qemu-devel] [RFC v1 09/13] target-ppc: add cmpeqb instruction |
Date: |
Sat, 23 Jul 2016 11:38:18 +0530 |
User-agent: |
Notmuch/0.21 (https://notmuchmail.org) Emacs/25.0.94.1 (x86_64-redhat-linux-gnu) |
Richard Henderson <address@hidden> writes:
> On 07/23/2016 12:58 AM, Nikunj A Dadhania wrote:
>> Richard Henderson <address@hidden> writes:
>>
>>> On 07/18/2016 10:35 PM, Nikunj A Dadhania wrote:
>>>> + tcg_gen_andi_tl(src1, cpu_gpr[rA(ctx->opcode)], 0xFF);
>>>> + for (i = 0; i < 64; i += 8) {
>>>> + tcg_gen_shri_tl(t0, arg1, i);
>>>> + tcg_gen_andi_tl(t0, t0, 0xFF);
>>>> + tcg_gen_brcond_tl(TCG_COND_EQ, src1, t0, l1);
>>>> + }
>>>> + tcg_gen_movi_i32(cpu_crf[crfD(ctx->opcode)], 0);
>>>> + tcg_gen_br(l2);
>>>> + gen_set_label(l1);
>>>> + /* Set match bit, i.e. CRF_GT */
>>>> + tcg_gen_movi_i32(cpu_crf[crfD(ctx->opcode)], 1 << CRF_GT);
>>>
>>> Ew. This can be done much better as
>>>
>>> http://graphics.stanford.edu/~seander/bithacks.html#ZeroInWord
>>>
>>> Which still might be best done in a helper, because of the constants
>>> involved
>>> (tcg is not nearly so good as gcc in building full 64-bit constants).
>>>
>>> C.f. target-alpha/int_helper.c, helper_cmpbe0 (which computes different
>>> information than cmpeqb, but is still helpful as an example).
>>
>> Thanks for the hints, that got reduce to following:
>>
>> #define haszero(v) (((v) - 0x0101010101010101UL) & ~(v) &
>> 0x8080808080808080UL)
>> #define hasvalue(x,n) (haszero((x) ^ (~0UL/255 * (n))))
>>
>> uint32_t helper_cmpeqb(target_ulong ra, target_ulong rb)
>> {
>> return !!hasvalue(rb, ra);
>> }
I had it like this:
+ tcg_gen_ext8u_tl(src1, cpu_gpr[rA(ctx->opcode)]);
+ gen_helper_cmpeqb(crf, src1, cpu_gpr[rB(ctx->opcode)]);
+ tcg_gen_shli_i32(crf, crf, CRF_GT);
> A couple of things:
>
> (1) I don't see N being masked to 0xff before replicating.
Was doing that in caller, but as suggested better to do it in the define.
> (2) You need to use ULL for 32-bit hosts, or casts, e.g.
Having it defined only for 64-bit
> #define dup(x) (((x) & 0xff) * (~(target_ulong)0 / 0xff))
> #define haszero(v) (((v) - dup(0x01)) & ~(v) & dup(0x80))
> #define hasvalue(x, n) haszero((x) ^ dup(n))
Yes, had similar stuff but with different names:
+/* if x = 0xab, returns 0xababababababababa */
+#define pattern(x) (~0UL/255 * (x))
+
+/* substract 1 from each byte, AND with inverse, check if MSB is set at each
+ * byte.
+ * i.e. ((0x00 - 0x01) & ~(0x00)) & 0x80
+ * (0xFF & 0xFF) & 0x80 = 0x80 (zero found)
+ */
+#define haszero(v) (((v) - pattern(0x01)) & ~(v) & pattern(0x80))
+
+/* When you XOR the pattern and there is a match, that byte will be zero */
+#define hasvalue(x,n) (haszero((x) ^ pattern(n)))
> (3) You probably ought to go ahead and add compute the proper crf value here:
>
> return hasvalue(rb, ra) ? 1 << CRF_GT : 0;
Sure
> so that within the translator you just have
>
> gen_helper_cmpeqb(cpu_crf[...], cpu_gpr[...], cpu_gpr[...])
Right, that will be a one-liner, will change.
Regards
Nikunj
- [Qemu-devel] [RFC v1 04/13] target-ppc: add cmprb instruction, (continued)
Re: [Qemu-devel] [RFC v1 09/13] target-ppc: add cmpeqb instruction, David Gibson, 2016/07/22
[Qemu-devel] [RFC v1 10/13] target-ppc: add setb instruction, Nikunj A Dadhania, 2016/07/18
[Qemu-devel] [RFC v1 11/13] target-ppc: add maddld instruction, Nikunj A Dadhania, 2016/07/18
[Qemu-devel] [RFC v1 12/13] target-ppc: add maddhd and maddhdu instruction, Nikunj A Dadhania, 2016/07/18