Re: [Qemu-devel] x86 tcg problem
From: Blue Swirl
Subject: Re: [Qemu-devel] x86 tcg problem
Date: Tue, 29 Jul 2008 20:18:25 +0300
On 7/29/08, Vince Weaver <address@hidden> wrote:
> Hello
>
> I've spent a day now trying to figure out why bzip2 compress/decompress
> doesn't work when using sparc32plus-linux-user on x86.
>
> I've tracked the problem to the Zero flag being improperly set (attached is
> a small exe/src that reproduces the problem.. it reports "Greater"
> on real hardware, "Less Than" on qemu current).
>
> The issue seems to be a misordering of an x86 sub instruction. I tried to
> track this down in the tcg code but I quickly got lost.
>
> The code does this for a compare (on sparc the compare turns into a
> subtract whose result goes to the zero reg, which ignores it):
>
> mov_i32 cc_src_0,g4_0 ;
> mov_i32 cc_src_1,g4_1 ; load g4 (0xaae60)
> mov_i32 cc_src2_0,g3_0 ;
> mov_i32 cc_src2_1,g3_1 ; load g3 (0)
> sub2_i32 cc_dst_0,cc_dst_1,cc_src2_0,cc_src2_1,cc_src_0,cc_src_1
> ; result = 0xaafe0-0
> movi_i32 psr,$0x0 ; clear psr
> mov_i32 tmp42,cc_dst_0 ; get cc_dst_0
> movi_i32 tmp43,$0x0 ;
> movi_i32 tmp44,$0x0 ;
> movi_i32 tmp45,$0x0 ; zero extends
> brcond2_i32 tmp42,tmp43,tmp44,tmp45,$0x1,$0x0 ; if not zero, skip
> movi_i32 tmp19,$0x400000 ; else set zero flag
>
>
>
> which converts into x86:
> 0xb80da04d: sub %ecx,%eax ; %ecx = g4-g3
> 0xb80da04f: sbb %ebx,%edx
> 0xb80da051: mov %eax,0x6c(%ebp) ; saving g3, not the result (ecx)!
> 0xb80da054: mov %edx,0x70(%ebp) ;
> 0xb80da057: xor %edx,%edx
> 0xb80da059: xor %ecx,%ecx ; clearing our result for use as psr
> ; result is lost!
> ; the later test for zero is done
> ; against g3 instead, which
> ; sets the zero flag when it
> ... ; shouldn't
> 0xb80da06f: test %eax,%eax
> 0xb80da071: jne 0xb80da091 ; skip if not zero
> ..
> 0xb80da07f: mov 0x8c(%ebp),%eax ; load psr
> 0xb80da085: or $0x400000,%eax ; set zero flag
>
>
> So unless there's some weird AT&T/intel ordering thing that is confusing me
> (please let me know if I am missing something), TCG is getting confused
> about which argument of the subtract is the result. I'm not sure how to fix
> this though...
Thank you for the analysis! IIRC, in AT&T syntax the destination is the
last operand, so sub %ecx,%eax is, in C:
eax -= ecx;
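
To spell out that reading for the sub/sbb pair in the quoted dump, here is a minimal C sketch. It is only a model of the two x86 instructions, not QEMU code; the pair32 type and the sub_sbb name are made up for illustration. Under this reading the difference ends up in %eax/%edx, so the store of %eax to 0x6c(%ebp) would be saving the low half of the result.

```c
#include <stdint.h>

/* Hypothetical model of a 64-bit value held in two 32-bit registers. */
typedef struct {
    uint32_t lo; /* plays the role of %eax (dst) or %ecx (src) */
    uint32_t hi; /* plays the role of %edx (dst) or %ebx (src) */
} pair32;

/*
 * Models:  sub %ecx,%eax   ->  eax -= ecx;
 *          sbb %ebx,%edx   ->  edx -= ebx + borrow;
 * AT&T syntax puts the source first and the destination last.
 */
static pair32 sub_sbb(pair32 dst, pair32 src)
{
    uint32_t borrow = dst.lo < src.lo; /* CF produced by the low sub */

    dst.lo -= src.lo;
    dst.hi = dst.hi - src.hi - borrow;
    return dst;
}
```

With dst = g4 (0xaae60) and src = g3 (0), the values from the quoted TCG listing, the low word of the result stays nonzero.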
Still, I can reproduce this, and the result on amd64 is not correct either:
---- 0x1008c
mov_i64 cc_src,g4
mov_i64 cc_src2,g3
sub_i64 cc_dst,cc_src,cc_src2
movi_i32 psr,$0x0
movi_i64 tmp22,$0xffffffff
and_i64 tmp21,cc_dst,tmp22
movi_i64 tmp22,$0x0
brcond_i64 tmp21,tmp22,$0x1,$0x0
0x601c287b: mov 0x20(%r14),%rcx
0x601c287f: mov %rdx,%r8
0x601c2882: mov %rcx,%r9
0x601c2885: sub %r8,%r9
0x601c2888: mov %r9,%rax
0x601c288b: and $0xffffffff,%eax
0x601c2891: mov %rsi,0x10a58(%r14)
0x601c2898: mov %rdi,0x10a60(%r14)
0x601c289f: mov %rcx,0x60(%r14)
0x601c28a3: mov %r8,0x68(%r14)
0x601c28a7: mov %r9,0x70(%r14)
0x601c28ab: xor %edi,%edi
0x601c28ad: mov %edi,0x90(%r14)
0x601c28b4: mov %rdx,0x18(%r14)
0x601c28b8: test %rax,%rax
0x601c28bb: jne 0x601c28d5
The C flag generation part of gen_op_sub_cc looks suspicious, though.
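
For reference, this is roughly what I would expect the 32-bit Z and C flags of a subtract to be when the registers are 64 bits wide, as in the amd64 dump above. It is a sketch and not what gen_op_sub_cc actually does: the sub_icc_zc name is made up, PSR_ZERO is the 0x400000 from the dump, and PSR_CARRY (bit 20) is my assumption from the SPARC V8 PSR layout.

```c
#include <stdint.h>

#define PSR_ZERO  0x400000u /* value used in the dump above */
#define PSR_CARRY 0x100000u /* assumption: icc C is PSR bit 20 */

/*
 * Hypothetical helper: Z and C flags for dst = src1 - src2, looking
 * only at the low 32 bits (the and with 0xffffffff in the TCG ops).
 */
static uint32_t sub_icc_zc(uint64_t src1, uint64_t src2)
{
    uint64_t dst = src1 - src2;
    uint32_t psr = 0;

    /* Z: low 32 bits of the result are zero */
    if ((dst & 0xffffffffu) == 0) {
        psr |= PSR_ZERO;
    }
    /* C: unsigned borrow out of the 32-bit subtract */
    if ((uint32_t)src1 < (uint32_t)src2) {
        psr |= PSR_CARRY;
    }
    return psr;
}
```

For the test case (g4 = 0xaae60, g3 = 0) this gives neither flag, which matches "Greater" on real hardware.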