qemu-devel

Re: [Qemu-devel] Redundant repz prefixes in generated amd64 code


From: axel
Subject: Re: [Qemu-devel] Redundant repz prefixes in generated amd64 code
Date: Sat, 17 Mar 2007 08:35:17 +0100
User-agent: KMail/1.9.5

On Friday 16 March 2007 20:30, Igor Kovalenko wrote:
> On 3/16/07, Julian Seward <address@hidden> wrote:
> > I'm seeing redundant repz (0xF3) prefixes in generated code, typically
> > just before jumps:
> >
> > <code_gen_buffer+415>:  repz mov $0xe07f,%eax
> > <code_gen_buffer+421>:  mov    %eax,0x20(%rbp)
> > <code_gen_buffer+424>:  lea    -25168302(%rip),%ebx  # 0xaf0420 <tbs+96>
> > <code_gen_buffer+430>:  retq
> > <code_gen_buffer+431>:  mov    -25168245(%rip),%eax  # 0xaf0460 <tbs+160>
> > <code_gen_buffer+437>:  jmpq   *%rax
> > <code_gen_buffer+439>:  repz mov $0xe092,%eax
> > <code_gen_buffer+445>:  mov    %eax,0x20(%rbp)
> > <code_gen_buffer+448>:  lea    -25168325(%rip),%ebx   # 0xaf0421 <tbs+97>
> > <code_gen_buffer+454>:  retq
> >
> > I assume these are something to do with translation chaining/unchaining
> > but have been unable to figure out where they come from.  I know they get
> > executed and so are not data - valgrind barfs on them.
> >
> > This is on a 64-bit host (Core 2) with qemu-0.9.0 compiled from source by
> > gcc-3.4.6, running an x86 (32-bit) guest.
> >
> > At a guess I'd say the mov $imm,%eax is (created by? to do with?)
> > gen_jmp_im in target-i386/translate.c, but I don't see how the F3
> > got in on the act.  Grepping the source for 0xF3 turns up nothing
> > plausible.  Any ideas where it comes from and how to get rid of it?
>
> Try -mtune=nocona something like the following

IMHO one should change dyngen. Below is a hack (ELF only; I cannot test the
COFF branch). It works for amd64->amd64 (tested with -no-kqemu), but it is not
safe, because the instruction before the ret may contain the 0xf3 byte as an
immediate operand.
A full solution would disassemble the whole function, determine the
instruction boundaries, and then decide where to cut the block to copy.
Perhaps one could then also detect multiple returns in a function and try to
rewrite the code blocks, replacing the multiple returns with jumps.
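
To make the hazard concrete, here is a minimal standalone sketch of the same
trimming heuristic. It is not code from dyngen.c: the helper name trimmed_len
and its signature are made up for illustration, and it only mirrors the logic
of the patch below (drop a trailing 0xc3, then drop a 0xf3 byte sitting
directly in front of it).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of dyngen.c): given the bytes of one op
 * function in [p_start, p_end), return how many bytes to copy after
 * dropping the trailing ret (0xc3) and, if present, a 0xf3 byte directly
 * before it. Returns -1 if the function is empty or does not end in ret. */
static long trimmed_len(const uint8_t *p_start, const uint8_t *p_end)
{
    long len = (long)(p_end - p_start);

    if (len <= 0 || p_end[-1] != 0xc3)
        return -1;              /* empty code, or no ret at the end */

    len--;                      /* drop the ret itself */

    /* Heuristic only: the byte is assumed to be a repz prefix, but it
     * could just as well be the last byte of the previous instruction. */
    if (len > 0 && p_end[-2] == 0xf3)
        len--;

    return len;
}
```

With the bytes 89 45 20 f3 c3 (mov %eax,0x20(%rbp); repz ret) this yields 3,
which is what we want. With b0 f3 c3 (mov $0xf3,%al; ret) it yields 1, so the
immediate byte is cut off and the copied code is broken. That is the case only
a real instruction-boundary scan would get right.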

Why are there two different blocks for COFF and ELF for x86/x86_64 hosts?

Axel

Index: dyngen.c
===================================================================
RCS file: /sources/qemu/qemu/dyngen.c,v
retrieving revision 1.49
diff -u -r1.49 dyngen.c
--- dyngen.c    4 Mar 2007 00:52:16 -0000       1.49
+++ dyngen.c    17 Mar 2007 07:19:41 -0000
@@ -1458,6 +1458,8 @@
             error("empty code for %s", name);
         if (p_end[-1] == 0xc3) {
             len--;
+           if ( len>0 && p_end[-2] == 0xf3 )
+               --len;
         } else {
             error("ret or jmp expected at the end of %s", name);
         }



