Re: [Qemu-devel] Redundant repz prefixes in generated amd64 code
From: axel
Date: Sat, 17 Mar 2007 08:35:17 +0100
User-agent: KMail/1.9.5
On Friday 16 March 2007 20:30, Igor Kovalenko wrote:
> On 3/16/07, Julian Seward <address@hidden> wrote:
> > I'm seeing redundant repz (0xF3) prefixes in generated code, typically
> > just before jumps:
> >
> > <code_gen_buffer+415>: repz mov $0xe07f,%eax
> > <code_gen_buffer+421>: mov %eax,0x20(%rbp)
> > <code_gen_buffer+424>: lea -25168302(%rip),%ebx # 0xaf0420 <tbs+96>
> > <code_gen_buffer+430>: retq
> > <code_gen_buffer+431>: mov -25168245(%rip),%eax # 0xaf0460 <tbs+160>
> > <code_gen_buffer+437>: jmpq *%rax
> > <code_gen_buffer+439>: repz mov $0xe092,%eax
> > <code_gen_buffer+445>: mov %eax,0x20(%rbp)
> > <code_gen_buffer+448>: lea -25168325(%rip),%ebx # 0xaf0421 <tbs+97>
> > <code_gen_buffer+454>: retq
> >
> > I assume these are something to do with translation chaining/unchaining
> > but have been unable to figure out where they come from. I know they get
> > executed and so are not data - valgrind barfs on them.
> >
> > This is on a 64-bit host (Core 2) with qemu-0.9.0 compiled from source by
> > gcc-3.4.6, running an x86 (32-bit) guest.
> >
> > At a guess I'd say the mov $imm,%eax is (created by? to do with?)
> > gen_jmp_im in target-i386/translate.c, but I don't see how the F3
> > got in on the act. Grepping the source for 0xF3 turns up nothing
> > plausible. Any ideas where it comes from and how to get rid of it?
>
> Try -mtune=nocona something like the following
IMHO one should change dyngen instead. Below is a hack (ELF only; I cannot test
the COFF branch). It works for amd64->amd64 (tested with -no-kqemu), but it is
not safe, because the instruction before the ret may end with a 0xf3 byte as
part of its immediate operand, and that byte would then be cut off by mistake.
A full solution would disassemble the whole function, determine the
instruction boundaries, and then decide where to cut the block to copy. Perhaps
one could then also detect multiple returns in a function and try to rewrite
the code block, replacing the multiple returns with jumps.
Why are there two separate blocks for COFF and ELF on x86/x86_64 hosts?
Axel
Index: dyngen.c
===================================================================
RCS file: /sources/qemu/qemu/dyngen.c,v
retrieving revision 1.49
diff -u -r1.49 dyngen.c
--- dyngen.c 4 Mar 2007 00:52:16 -0000 1.49
+++ dyngen.c 17 Mar 2007 07:19:41 -0000
@@ -1458,6 +1458,8 @@
error("empty code for %s", name);
if (p_end[-1] == 0xc3) {
len--;
+ if ( len>0 && p_end[-2] == 0xf3 )
+ --len;
} else {
error("ret or jmp expected at the end of %s", name);
}