qemu-devel

Re: [Qemu-devel] qemu vs gcc4


From: Paul Brook
Subject: Re: [Qemu-devel] qemu vs gcc4
Date: Tue, 31 Oct 2006 23:00:42 +0000
User-agent: KMail/1.9.5

On Tuesday 31 October 2006 22:31, Laurent Desnogues wrote:
> Paul Brook wrote:
> > Replacing the pregenerated blocks with hand written assembly isn't
> > feasible. Each target has its own set of ops, and each host would need
> > its own assembly implementation of those ops. Multiply 11 targets by 11
> > hosts and you get an unmaintainable mess :-)
>
> Shouldn't you have 11+11 and not 11*11, given your intermediate
> representation?  And of these 11+11, 11 have to be written
> anyway (target).  Or did I miss something?

If you use qops (which is a target and host independent intermediate 
representation) it's 11 + 11. If you just replace the existing dyngen op.c 
with hand written assembly it's 11 * 11.
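
To make the counting argument concrete, here is a minimal sketch in C. The function names are hypothetical and exist only for illustration; nothing here is taken from the qemu sources. It assumes each target needs one front end and each host one back end when a shared intermediate representation is used, and one hand-written op set per (target, host) pair otherwise.

```c
/* Hypothetical helpers: count how many implementations must be maintained. */

/* Hand-written assembly replacing op.c: one op set per (target, host) pair. */
static unsigned impls_without_ir(unsigned targets, unsigned hosts)
{
    return targets * hosts;
}

/* Shared intermediate representation: one front end per target
   plus one back end per host. */
static unsigned impls_with_ir(unsigned targets, unsigned hosts)
{
    return targets + hosts;
}
```

With 11 targets and 11 hosts, the first function gives 121 and the second gives 22.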

> > On RISC targets like ARM most instructions don't set the condition codes,
> > so we don't bother doing this.
>
> Except for ARM Thumb ISA which always sets flags.  ARM is a bad
> RISC example :)

Bah. Details :-)

> I was wondering if you did some profiling to know how much time
> is spent in disas_arm_insn.  Of course the profiling results
> would be very different for a Linux boot or a synthetic benchmark

The qop generator does add some overhead to the code translation. I haven't 
done proper benchmarks, but in most cases it doesn't seem to be too bad 
(maybe 10%). I'm hoping we can get most of that back.

> (which makes me think that you don't support MMU, do you?).

qemu does implement an MMU.
Currently this still uses the dyngen code, but that's fixable.

> There is a very nice trick to speed up decoding of ARM
> instructions:  pick up bits 20-27 and 4-7 and you (almost) get
> one instruction per case entry;  of course this means using a
> generator to write the 4096 entries, but the result was good for
> my interpreted ISS, reaching 44 M i/s on an Opteron @2.4GHz
> without any compiler dependent trick (such as gcc jump to labels).

qemu generally gets 100-200MIPS on my 2GHz Opteron.
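
For reference, the decoding trick quoted above could be sketched roughly as follows. This is only an illustration of combining bits 20-27 and bits 4-7 into a 12-bit table index (hence the 4096 entries); the names `arm_decode_index` and `ARM_DECODE_TABLE_SIZE` are made up for this example and are not part of qemu or of Laurent's simulator.

```c
#include <stdint.h>

/* 8 bits (27..20) + 4 bits (7..4) = 12 bits, so 4096 table entries. */
#define ARM_DECODE_TABLE_SIZE 4096u

/* Hypothetical helper: build the 12-bit index from a 32-bit ARM
   instruction word. Bits 27..20 become index bits 11..4 and
   bits 7..4 become index bits 3..0. */
static unsigned arm_decode_index(uint32_t insn)
{
    return ((insn >> 16) & 0xff0u) | ((insn >> 4) & 0xfu);
}
```

A generated table of ARM_DECODE_TABLE_SIZE handlers would then be indexed with arm_decode_index(insn). For example, the word 0xE0810002 (ADD r0, r1, r2) maps to index 0x080, and 0xE0000291 (MUL r0, r1, r2) maps to index 0x009.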

Paul



