qemu-ppc
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2] target/ppc: add vmsumudm vmsumcud instructions


From: Lijun Pan
Subject: Re: [PATCH v2] target/ppc: add vmsumudm vmsumcud instructions
Date: Fri, 19 Jun 2020 00:10:06 -0500


> On Jun 18, 2020, at 6:09 PM, Richard Henderson <richard.henderson@linaro.org> 
> wrote:
> 
> On 6/15/20 1:53 PM, Lijun Pan wrote:
>>>> +static inline void uint128_add(uint64_t ah, uint64_t al, uint64_t bh,
>>>> +          uint64_t bl, uint64_t *rh, uint64_t *rl, uint64_t *ca)
>>>> +{
>>>> +  __uint128_t a = (__uint128_t)ah << 64 | (__uint128_t)al;
>>>> +  __uint128_t b = (__uint128_t)bh << 64 | (__uint128_t)bl;
>>>> +  __uint128_t r = a + b;
>>>> +
>>>> +  *rh = (uint64_t)(r >> 64);
>>>> +  *rl = (uint64_t)r;
>>>> +  *ca = (~a < b);
>>>> +}
>>> 
>>> This is *not* what I had in mind at all.
>>> 
>>> int128.h should be operating on Int128, and *not* component uint64_t values.
>> 
>> Should uint128_add() be included in a new file called uint128.h? or still at 
>> host-utils.h?
> 
> If you want this sort of specific operation, you should leave it in 
> target/ppc/.
> 
> I had been hoping that you could make use of Int128 as-is, or with minimal
> adjustment in the same style.
> 
>> vmsumudm/vmsumcud operate as follows:
>> 1. 128-bit prod1 = (high 64 bits of a) * (high 64 bits of b), // I reuse 
>> mulu64()

This is an implementation not relying on 128 bit compiler support (not defined 
CONFIG_INT128), 
hence using mulu64().

>> 2. 128-bit prod2 = (high 64 bits of b) * (high 64 bits of b), // I reuse 
>> mulu64()
>> 3. 128-bit result = prod1 + prod2 + c; // I added addu128() in v1, renamed 
>> it to uint128_add() in v2
> 
> Really?  That seems a very odd computation.  Your code,
> 
>> +    prod1 = (__uint128_t)ah * (__uint128_t)bh;
>> +    prod2 = (__uint128_t)al * (__uint128_t)bl;
>> +    r = prod1 + prod2 + c;
> 
> is slightly different, but still very odd.

Above 3 lines of code are using 128 bit compiler suppor (#ifdef CONFIG_INT128). 

> 
> Why would we be adding the intermediate 128th bit of the 256-bit product
> (prod1, bit 0) with the 0th bit of the 256-bit product (prod2, bit 0).
> 
> Unfortunately, I can't find the v3.1 spec online yet, so I can't look at this
> myself.  What is the instruction supposed to produce?

https://ibm.ent.box.com/s/hhjfw0x0lrbtyzmiaffnbxh2fuo0fog0

> 
>> To better understand your request, may I ask you several questions:
>> 1. keep mulsum() in target/ppc/int_helper.c?
> 
> Probably.
> 
>> If so, it will inevitably have  #ifdef CONFIG_INT128 #else #endif in that 
>> function.  
> 
> No, you don't have to ifdef.  You can use uint64_t alone and not rely on
> compiler support for __uint128_t at all.
> 
> 
> r~
> 




reply via email to

[Prev in Thread] Current Thread [Next in Thread]