Re: [Qemu-devel] Profiling Qemu for speed?

From: John R. Hogerhuis
Subject: Re: [Qemu-devel] Profiling Qemu for speed?
Date: Sun, 17 Apr 2005 01:21:14 -0700

On Sat, 2005-04-16 at 22:58 -0700, Joe Luser wrote:
> Hi folks,
> I've been using Qemu on the Mac for a few days now; Several OSes
> running (including Windoze), and I'm impressed. The source looks pretty
> clean, too.
> Has anyone done any profiling work to see where Qemu spends most of its
> processing time? It is fast running x86 code on the PPC, but I'd like
> to find ways to speed things up even more.
> If you know of areas that will make a big difference in speed, please
> let me know.
> -Nathaniel G H

From what I've read the Cirrus SVGA emulation probably deserves some
attention. Read through the archives, there have been some recent

Beyond that, what will always be left is continual tweaking of the
dynamic code generator with whatever heuristics and host-platform-specific
tricks you can conjure up. There is pretty much no end to that sort of
thing.

QEMU takes the machine instructions executing on one virtual computer
(the guest), dynamically translates them to the working equivalent on
another (the host), stores the translated code in a cache to save
reprocessing time (that's the *big* time-saving heuristic over Bochs),
and executes the dynamically generated code on the host.
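
To make the translate, cache, execute loop concrete, here is a minimal
sketch in Python. Everything in it is invented for illustration: the toy
guest instruction set, the names GUEST_PROGRAM, translate_block and run,
and the use of generated Python source as the "host code". It is not how
QEMU is implemented; it only shows why caching translated blocks pays
off when a loop body runs many times.

```python
# Toy guest program (hypothetical ISA): address -> instruction tuple.
GUEST_PROGRAM = {
    0: ("movi", "r0", 0),
    1: ("movi", "r1", 5),
    2: ("add", "r0", "r1"),   # loop body starts here
    3: ("subi", "r1", 1),
    4: ("jnz", "r1", 2),      # jump back to 2 while r1 != 0
    5: ("halt",),
}


def translate_block(program, pc):
    """Translate the basic block starting at pc into a host callable.

    The block ends at the first jump or halt. The generated function
    updates the register dict and returns the next guest pc (or None).
    """
    lines = ["def block(regs):"]
    while True:
        insn = program[pc]
        op = insn[0]
        if op == "movi":
            lines.append(f"    regs[{insn[1]!r}] = {insn[2]}")
        elif op == "add":
            lines.append(f"    regs[{insn[1]!r}] += regs[{insn[2]!r}]")
        elif op == "subi":
            lines.append(f"    regs[{insn[1]!r}] -= {insn[2]}")
        elif op == "jnz":
            lines.append(
                f"    return {insn[2]} if regs[{insn[1]!r}] != 0 else {pc + 1}"
            )
            break
        elif op == "halt":
            lines.append("    return None")
            break
        else:
            raise ValueError(f"unknown guest op: {op}")
        pc += 1
    namespace = {}
    exec("\n".join(lines), namespace)  # "emit" the host code
    return namespace["block"]


def run(program, entry=0):
    """Execute the guest program, translating each block at most once."""
    cache = {}  # guest pc -> translated block
    stats = {"translations": 0, "executions": 0}
    regs = {"r0": 0, "r1": 0}
    pc = entry
    while pc is not None:
        block = cache.get(pc)
        if block is None:  # cache miss: translate and remember
            block = translate_block(program, pc)
            cache[pc] = block
            stats["translations"] += 1
        stats["executions"] += 1
        pc = block(regs)
    return regs, stats


if __name__ == "__main__":
    print(run(GUEST_PROGRAM))
```

In this toy run the loop body is translated once and then executed from
the cache on every later iteration, which is the saving a pure
interpreter does not get.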

Think of it like this: aside from translation speed, the best generator
of dynamic code would be an expert assembly language programmer on both
the platform you are translating from and the one you are translating
to. As you can imagine, QEMU has its tricks and leverages GCC, but it
will never attain the nirvana of being as good as that assembly language
programmer. That's an asymptote. So that's why I say there will always
be lots of things you can do in the dynamic code generator to speed it
up.

One thought would be to have a peephole optimizer that looks back over
the just-translated basic block (or a state machine that matches such
sequences as an on-line algorithm), matches it against common, known
primitive sequences, and replaces them with optimized versions.
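
As a rough sketch of that idea, the Python below runs a peephole pass
over a translated block represented as a list of tuples. The op names
(push, pop, mov, addi), the ("mov", dst, src) operand order, and the
three rewrite rules are assumptions made up for the example, and the toy
ops are assumed to have no side effects such as condition flags. A real
pass would have to prove each rewrite safe for the actual host
instructions.

```python
def peephole(block):
    """Rewrite known wasteful sequences in a list of toy primitive ops.

    Rules (all hypothetical):
      ("addi", r, 0)                   -> removed (adds nothing)
      ("push", x), ("pop", y)          -> ("mov", y, x)
      ("mov", a, b), ("mov", b, a)     -> ("mov", a, b)
    """
    out = []
    for insn in block:
        out.append(insn)
        changed = True
        # Re-check the tail after every rewrite, since one replacement
        # can expose another match with the instruction before it.
        while changed:
            changed = False
            if out and out[-1][0] == "addi" and out[-1][2] == 0:
                out.pop()
                changed = True
                continue
            if len(out) >= 2:
                a, b = out[-2], out[-1]
                if a[0] == "push" and b[0] == "pop":
                    out[-2:] = [("mov", b[1], a[1])]
                    changed = True
                elif (a[0] == "mov" and b[0] == "mov"
                      and a[1] == b[2] and a[2] == b[1]):
                    out.pop()  # the second mov copies the value back
                    changed = True
    return out


if __name__ == "__main__":
    example = [
        ("push", "eax"),
        ("pop", "ebx"),
        ("addi", "ecx", 0),
        ("mov", "eax", "ebx"),
        ("add", "eax", "ecx"),
    ]
    print(peephole(example))
```

Because the pass only ever inspects the last one or two emitted ops, it
could equally run on-line while the block is being generated, which is
the state-machine variant mentioned above.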

The kind of profiling you would want to do here is to run, say, Windows
and take a snapshot of the dynamic code cache, and look for common
instruction sequences. Ideally, you could write some software to do this

Anyway, I'm sure there are lots of other ideas lying around.

-- John.
