qemu-devel

Re: [Qemu-devel] [Consult] tilegx: About floating point instructions


From: Chen Gang
Subject: Re: [Qemu-devel] [Consult] tilegx: About floating point instructions
Date: Sun, 9 Aug 2015 01:23:32 +0800
User-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:31.0) Gecko/20100101 Thunderbird/31.7.0

Hello all:

Below is my current plan for all the floating point insns. It is not a
precise implementation, and not even a complete one -- it assumes that
the pack insns, when used individually, only pack a (u)int32_t:

  fsingle_add1        ; return calc flags, save calc result to env.

  fsingle_sub1        ; return calc flags, save calc result to env.

  fsingle_addsub2     ; set "has result" flag.

  fsingle_mul1        ; skip return value, save calc result to env.
                        set "has result" flag.

  fsingle_mul2        ; skipped.


  fsingle_pack1       ; skipped.

  fsingle_pack2       ; if "has result"
                            reset "has result" flag.
                            return calc result from env.
                        else
                            pack srca 
                            reference from tilegx.md: float(uns)sisf2.
                            get (u)int32_t a, then (u)int32_to_float32.

  fdouble_unpack_max: ; skipped.

  fdouble_unpack_min: ; skipped.

  fdouble_add_flags:  ; return calc flags, save calc result to env.

  fdouble_sub_flags:  ; return calc flags, save calc result to env.

  fdouble_addsub:     ; set "has result" flag.

  fdouble_mul_flags:  ; skip return flags, save calc result to env.
                        set "has result" flag.

  fdouble_pack1:      ; if "has result" 
                            reset "has result" flag.
                            return calc result from env.
                        else
                            pack srca and srcb.
                            reference from tilegx.md: float(uns)sidf2.
                            get (u)int32_t a, then (u)int32_to_float64.

  fdouble_pack2:      ; skipped.


  (fsingle_add1/sub1 and fdouble_add/sub_flags can be used individually,
   e.g. in the gcc testsuite for complex numbers).
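
As a rough illustration of the single-precision part of the plan above,
here is a minimal standalone C sketch. It is not QEMU code: the FPEnv
struct, the C function signatures, the is_signed parameter, and the
placeholder flags value of 0 are all assumptions made for illustration,
and host float arithmetic stands in for the softfloat routines.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the floating point part of the CPU state. */
typedef struct {
    uint32_t calc_result;  /* float32 bit pattern saved by add1/mul1 */
    bool has_result;       /* the "has result" flag from the plan */
} FPEnv;

/* Reinterpret between a host float and its 32-bit pattern. */
static uint32_t f32_bits(float f)
{
    uint32_t u;
    assert(sizeof(f) == sizeof(u));
    memcpy(&u, &f, sizeof(u));
    return u;
}

static float bits_f32(uint32_t u)
{
    float f;
    assert(sizeof(f) == sizeof(u));
    memcpy(&f, &u, sizeof(f));
    return f;
}

/* fsingle_add1: do the whole add, save it to env, return the calc flags.
 * The real flag layout is not documented, so 0 is only a placeholder.
 * fsingle_sub1 would be the same with '-'. */
static uint64_t fsingle_add1(FPEnv *env, uint32_t srca, uint32_t srcb)
{
    env->calc_result = f32_bits(bits_f32(srca) + bits_f32(srcb));
    return 0;
}

/* fsingle_addsub2: only mark that env holds a pending result. */
static void fsingle_addsub2(FPEnv *env)
{
    env->has_result = true;
}

/* fsingle_mul1: do the whole multiply and mark the result as pending. */
static void fsingle_mul1(FPEnv *env, uint32_t srca, uint32_t srcb)
{
    env->calc_result = f32_bits(bits_f32(srca) * bits_f32(srcb));
    env->has_result = true;
}

/* fsingle_pack2: hand back the pending result if there is one; otherwise
 * treat srca as a plain (u)int32_t and convert it (the float(uns)sisf2
 * case). 'is_signed' is an assumed parameter; real code would have to
 * derive the signedness from the insn operands. */
static uint32_t fsingle_pack2(FPEnv *env, uint32_t srca, bool is_signed)
{
    if (env->has_result) {
        env->has_result = false;
        return env->calc_result;
    }
    return f32_bits(is_signed ? (float)(int32_t)srca : (float)srca);
}
```

With this sketch, the sequence add1, addsub2, pack2 returns the sum from
pack2, while a lone pack2 falls through to the integer conversion.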


Next, I shall implement the floating point insns. Any related ideas,
suggestions, and additions are welcome.

Thanks.


On 8/5/15 22:16, Chen Gang wrote:
> On 8/4/15 23:04, Richard Henderson wrote:
>> On 08/04/2015 06:56 AM, Chen Gang wrote:
>>>
>>> On 8/4/15 04:47, Chen Gang wrote:
>>>> On 8/4/15 00:40, Richard Henderson wrote:
>>>>> On 08/01/2015 02:47 AM, Chen Gang wrote:
>>>>>> I am just adding floating point instructions (e.g. fsingle_add1),
>>>>>> but for me, I can not find any details about them (the ISA
>>>>>> documents only give a summary description, but not details), e.g.
>>>>>
>>>>> The tilegx splits the four/six cycle arithmetic into multiple
>>>>> black-box instructions.  You need only really implement one of the
>>>>> four, with the rest of them being implemented as nops or moves.
>>>>>
>>>>> Looking at what gcc produces gives the hints:
>>>>>
>>>>> fdouble_unpack_min      min, srca, srcb
>>>>> fdouble_unpack_max      max, srca, srcb
>>>>> fdouble_add_flags       flg, srca, srcb
>>>>> fdouble_addsub          max, min, flg
>>>>> fdouble_pack1           dst, max, flg
>>>>> fdouble_pack2           dst, max, zero
>>>>>
>>>>> The unpack, addsub, and pack2 insns can be ignored, the add_flags
>>>>> insn can perform the whole operation, the pack1 insn performs a move
>>>>> from "flg" to "dst".
>>>>>
>>>>> Similarly for the single-precision:
>>>>>
>>>>> fsingle_add1            tmp, srca, srcb
>>>>> fsingle_addsub2         tmp, srca, srcb
>>>>> fsingle_pack1           flg, tmp
>>>>> fsingle_pack2           dst, tmp, flg
>>>>>
>>>>> The add1 insn performs the whole operation, the addsub2 and pack1
>>>>> insns are ignored, and the pack2 insn is a move from tmp to dst.
>>>>>
>>>
>>> After checking tilegx.md completely, I think we still need to implement
>>> each of them precisely, or we cannot emulate all cases (e.g. muldf3).
>>
>> No, you can still implement all of muldf3 in fdouble_mul_flags.
>> Again, the fdouble_pack1 copies from the flag input to the output.
>>
>> Yes, there is a 64-bit multiply in there, but the tcg optimizer
>> should be able to delete all of that as unused.  Especially if you have the
>> fdouble_unpack* insns store zero into their destinations.
>>
> 
> I am not quite sure, but I guess what you said should be OK (at least,
> it is very useful for the implementation).
> 
> 
>> Don't get me wrong -- more accurate implementation of the actual
>> insns would be nice, especially for debugging.  But if the insns
>> aren't accurately documented I don't see what choice we have.
>>
> 
> I guess we can still try to implement the details.
> 
>  - The document has a summary of every floating point instruction, so we
>    can work out, or guess, the whole implementation.
> 
>  - gcc uses all of them, so it is a good sample and a good reference
>    (but we should not assume gcc must be correct, since we use qemu
>    for the gcc testsuite).
> 
>  - The tilegx floating point format should be standard (or at least
>    based on the standard format), so we can look up the related
>    information via google/baidu.
> 
> 
>> On the good side, implementing the entire operation as part of the "flags"
>> step probably results in faster emulation.
>>
> 
> I guess so, too.
> 
> 
> I shall try to finish the simple implementation first, then implement
> the floating point instructions in detail in the future (that should be
> lower priority).
> 
> 
> Thanks.
> 

-- 
Chen Gang

Open, share, and attitude like air, water, and life which God blessed


