Re: [Qemu-devel] [PATCH v3 2/6] Add copy and constant propagation.
From: Blue Swirl
Subject: Re: [Qemu-devel] [PATCH v3 2/6] Add copy and constant propagation.
Date: Thu, 4 Aug 2011 19:24:31 +0000
On Thu, Aug 4, 2011 at 6:42 PM, Blue Swirl <address@hidden> wrote:
> On Wed, Aug 3, 2011 at 9:03 PM, Stefan Weil <address@hidden> wrote:
>> On 03.08.2011 22:56, Stefan Weil wrote:
>>>
>>> On 03.08.2011 22:20, Blue Swirl wrote:
>>>>
>>>> On Wed, Aug 3, 2011 at 7:00 PM, Stefan Weil <address@hidden> wrote:
>>>>>
>>>>> On 07.07.2011 14:37, Kirill Batuzov wrote:
>>>>>>
>>>>>> Make tcg_constant_folding do copy and constant propagation. It is a
>>>>>> preparational work before actual constant folding.
>>>>>>
>>>>>> Signed-off-by: Kirill Batuzov<address@hidden>
>>>>>> ---
>>>>>> tcg/optimize.c | 182 +++++++++++++++++++++++++++++++++++++++++++++++++++++++-
>>>>>> 1 files changed, 180 insertions(+), 2 deletions(-)
>>>>>>
>>>>>> diff --git a/tcg/optimize.c b/tcg/optimize.c
>>>>>> index c7c7da9..f8afe71 100644
>>>>>> --- a/tcg/optimize.c
>>>>>> +++ b/tcg/optimize.c
>>>>>>
>>>>>
>>>>> ...
>>>>>
>>>>> This patch breaks QEMU on 32 bit hosts (tested on 386 Linux
>>>>> and w32 hosts). Simply running qemu (BIOS only) terminates
>>>>> with abort(). As the error is easy to reproduce, I don't provide
>>>>> a stack frame here.
>>>>
>>>> I can't reproduce, i386/Linux and win32 versions of i386, Sparc32 and
>>>> Sparc64 emulators work fine.
>>>>
>>>> Maybe you have a stale build (bug in Makefile dependencies)?
>>>
>>> Sorry, an important information was wrong / missing in my report.
>>> It's not qemu, but qemu-system-x86_64 which fails to work.
>>>
>>> I just tested it once more with a new build:
>>>
>>> $ bin/x86_64-softmmu/qemu-system-x86_64 -L pc-bios
>>> /qemu/tcg/tcg.c:1646: tcg fatal error
>>> Abgebrochen
>
> OK, now that is broken also for me.
>
>>> Cheers,
>>> Stefan
>>
>> qemu-system-mips64el fails with the same error, so the problem
>> occurs when running 64 bit emulations on 32 bit hosts.
>
> Not always, Sparc64 still works fine.
x86_64 fails because 'mov_i32 cc_src_0,loc25' is incorrectly optimized
to 'mov_i32 cc_src_0,tmp6', but tmp6 is dead after the brcond.
IN:
0x000000000ffeb90a: shl %cl,%eax
OP:
---- 0xffeb90a
mov_i32 tmp2,rcx_0
mov_i32 tmp3,rcx_1
mov_i32 tmp0,rax_0
mov_i32 tmp1,rax_1
movi_i32 tmp20,$0x1f
and_i32 tmp2,tmp2,tmp20
movi_i32 tmp3,$0x0
movi_i32 tmp21,$0xffffffff
movi_i32 tmp22,$0xffffffff
add2_i32 tmp16,tmp17,tmp2,tmp3,tmp21,tmp22
movi_i32 tmp20,$0x80bd4e0
call tmp20,$0x30,$2,tmp6,tmp7,tmp0,tmp1,tmp16,tmp17
...tmp6 is assigned here...
movi_i32 tmp20,$0x80bd4e0
call tmp20,$0x30,$2,tmp0,tmp1,tmp0,tmp1,tmp2,tmp3
mov_i32 rax_0,tmp0
movi_i32 rax_1,$0x0
mov_i32 loc23,tmp0
mov_i32 loc24,tmp1
mov_i32 loc25,tmp6
...tmp6 saved to loc25 to survive brcond...
mov_i32 loc26,tmp7
movi_i32 tmp21,$0x0
movi_i32 tmp22,$0x0
brcond2_i32 tmp2,tmp3,tmp21,tmp22,eq,$0x0
mov_i32 cc_src_0,loc25
...used here.
mov_i32 cc_src_1,loc26
mov_i32 cc_dst_0,loc23
mov_i32 cc_dst_1,loc24
movi_i32 cc_op,$0x24
set_label $0x0
movi_i32 tmp8,$0xffeb90c
movi_i32 tmp9,$0x0
st_i32 tmp8,env,$0x80
st_i32 tmp9,env,$0x84
movi_i32 tmp20,$debug
call tmp20,$0x0,$0
OP after liveness analysis:
---- 0xffeb90a
mov_i32 tmp2,rcx_0
nopn $0x2,$0x2
mov_i32 tmp0,rax_0
mov_i32 tmp1,rax_1
movi_i32 tmp20,$0x1f
and_i32 tmp2,tmp2,tmp20
movi_i32 tmp3,$0x0
movi_i32 tmp21,$0xffffffff
movi_i32 tmp22,$0xffffffff
add2_i32 tmp16,tmp17,tmp2,tmp3,tmp21,tmp22
movi_i32 tmp20,$0x80bd4e0
call tmp20,$0x30,$2,tmp6,tmp7,tmp0,tmp1,tmp16,tmp17
OK
movi_i32 tmp20,$0x80bd4e0
call tmp20,$0x30,$2,tmp0,tmp1,tmp0,tmp1,tmp2,tmp3
mov_i32 rax_0,tmp0
movi_i32 rax_1,$0x0
mov_i32 loc23,tmp0
mov_i32 loc24,tmp1
mov_i32 loc25,tmp6
OK, though loc25 is unused after this. Why is it not optimized away?
mov_i32 loc26,tmp7
movi_i32 tmp21,$0x0
movi_i32 tmp22,$0x0
brcond2_i32 tmp2,tmp3,tmp21,tmp22,eq,$0x0
mov_i32 cc_src_0,tmp6
Incorrect optimization.
mov_i32 cc_src_1,tmp7
mov_i32 cc_dst_0,tmp0
mov_i32 cc_dst_1,tmp1
movi_i32 cc_op,$0x24
set_label $0x0
movi_i32 tmp8,$0xffeb90c
movi_i32 tmp9,$0x0
st_i32 tmp8,env,$0x80
st_i32 tmp9,env,$0x84
movi_i32 tmp20,$debug
call tmp20,$0x0,$0
end
The corresponding translation code is at target-i386/translate.c:1456;
it looks correct.
Maybe the optimizer should treat stack and memory temporaries
differently from register temporaries?
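
To make the suggestion concrete, here is a rough sketch of what I mean. It is a toy model, not the code in tcg/optimize.c: all of the types and names (temp_kind, struct op, propagate_copies and so on) are made up for illustration, and it only handles mov, brcond and set_label. It assumes that plain temporaries (tmpN) die at the end of a basic block while local temporaries (locN) and globals survive. At a brcond it forgets every copy whose source is a plain temporary, and at a label it forgets everything because control flow joins there.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: these types and names are hypothetical, not QEMU's. */
enum temp_kind { TEMP_NORMAL, TEMP_LOCAL, TEMP_GLOBAL }; /* tmpN, locN, globals */
enum opcode { OP_MOV, OP_BRCOND, OP_SET_LABEL };

struct op {
    enum opcode opc;
    int dst; /* destination temp index (OP_MOV only) */
    int src; /* source temp index (OP_MOV only) */
};

#define MAX_TEMPS 16
#define NO_COPY (-1)

/*
 * Forward copy propagation over a linear op list.
 * copy_of[t] is the temp whose value t is known to hold, or NO_COPY.
 */
void propagate_copies(struct op *ops, size_t n_ops,
                      const enum temp_kind *kinds, int n_temps)
{
    int copy_of[MAX_TEMPS];

    assert(n_temps <= MAX_TEMPS);
    for (int t = 0; t < n_temps; t++) {
        copy_of[t] = NO_COPY;
    }

    for (size_t i = 0; i < n_ops; i++) {
        struct op *op = &ops[i];

        if (op->opc == OP_SET_LABEL) {
            /* Control flow joins here: nothing is known any more. */
            for (int t = 0; t < n_temps; t++) {
                copy_of[t] = NO_COPY;
            }
            continue;
        }
        if (op->opc == OP_BRCOND) {
            /* Plain temps die at the end of the basic block, so a copy
             * that points at one must not be used after the branch. */
            for (int t = 0; t < n_temps; t++) {
                if (copy_of[t] != NO_COPY &&
                    kinds[copy_of[t]] == TEMP_NORMAL) {
                    copy_of[t] = NO_COPY;
                }
            }
            continue;
        }

        /* OP_MOV: read through a known copy, then record the new one. */
        if (copy_of[op->src] != NO_COPY) {
            op->src = copy_of[op->src];
        }
        for (int t = 0; t < n_temps; t++) {
            if (copy_of[t] == op->dst) {
                copy_of[t] = NO_COPY; /* dst is overwritten */
            }
        }
        copy_of[op->dst] = (op->dst == op->src) ? NO_COPY : op->src;
    }
}
```

With temps 0 = tmp6, 1 = loc25 and 2 = cc_src_0, the sequence "mov loc25,tmp6; brcond; mov cc_src_0,loc25" keeps loc25 as the source of the last mov, while the same sequence without the brcond is still rewritten to read tmp6 directly. Simply resetting all copy information at every basic block end would be a more conservative alternative.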