qemu-devel

Re: [Qemu-devel] Profiling Qemu for speed?


From: Ian Rogers
Subject: Re: [Qemu-devel] Profiling Qemu for speed?
Date: Mon, 18 Apr 2005 15:19:17 +0100
User-agent: Mozilla Thunderbird 0.8 (X11/20040913)

Sorry, I was responding to Karl Magdsick's point about the cost of switch statements, as it relates to Nathaniel G.H.'s point about the cost of translation/generation.

In my experience FDO does work for interpreters and translators, because code sequences are pretty predictable things (you just don't tend to choose instructions at random). Improving the dynamically generated code is the best way to improve performance, as that's where you spend your time. However, the time spent compiling means you only want to optimise hot spots. A well thought out, simple just-in-time compiler is often very hard to beat with an optimising compiler; for example, knowing your target architecture well means you can get a really good static mapping of its registers onto the host architecture. These lessons are well known in the literature.

I'd still be interested to know whether the translator's profiled performance improves with FDO. I have results from writing things in Java, where FDO is par for the course, but they hardly apply to QEMU/GCC.
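
To make the switch-statement point concrete, here is a minimal sketch of the kind of switch-based dispatch loop under discussion. It is a toy stack interpreter, not QEMU code; the opcode names and the `run` function are made up for illustration. The comment at the top shows how GCC's -fprofile-generate and -fprofile-use options would typically be applied to such a file, assuming it is saved as interp.c with a small driver that feeds it a representative workload.

```c
/* Sketch only: a toy bytecode interpreter, not taken from QEMU.
 *
 * Assumed build steps for profile feedback with GCC (file name and
 * driver program are hypothetical):
 *   gcc -O2 -fprofile-generate interp.c driver.c -o interp
 *   ./interp                    (run a representative workload)
 *   gcc -O2 -fprofile-use interp.c driver.c -o interp
 */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcodes for a tiny stack machine. */
enum opcode { OP_PUSH, OP_ADD, OP_MUL, OP_HALT };

#define STACK_DEPTH 64

/* Execute `len` words of bytecode.  Returns the top of the stack when
 * OP_HALT is reached, or 0 if the program is malformed or never halts. */
int64_t run(const int64_t *code, size_t len)
{
    int64_t stack[STACK_DEPTH];
    size_t sp = 0;  /* number of values currently on the stack */
    size_t pc = 0;  /* index of the next word to decode */

    assert(code != NULL);

    while (pc < len) {
        /* This switch is the dispatch whose cost was being discussed.
         * With profile data the compiler knows which cases are hot. */
        switch (code[pc++]) {
        case OP_PUSH:
            if (pc >= len || sp >= STACK_DEPTH)
                return 0;
            stack[sp++] = code[pc++];   /* operand follows the opcode */
            break;
        case OP_ADD:
            if (sp < 2)
                return 0;
            sp--;
            stack[sp - 1] += stack[sp];
            break;
        case OP_MUL:
            if (sp < 2)
                return 0;
            sp--;
            stack[sp - 1] *= stack[sp];
            break;
        case OP_HALT:
            return sp ? stack[sp - 1] : 0;
        default:
            return 0;                   /* unknown opcode */
        }
    }
    return 0;
}
```

A driver would call `run` on an array of opcodes and operands. Whether the profiled build is actually faster is exactly the open question, so it would need measuring rather than assuming.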

Regards,

Ian Rogers
-- http://www.binarytranslator.org/

Daniel Egger wrote:

On 18.04.2005, at 11:51, Ian Rogers wrote:

I'm not sure if you can get GCC to generate code sequences like this, but you probably at least need to use the -fprofile-generate and -fprofile-use options
http://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html


Feedback optimisation (FDO) will not work for two reasons:
a) qemu itself is something like a realtime compiler so FDO
   will only speed up the compiler but not the generated code
b) FDO will only provide speed boosts if the feedback phase
   has a chance to analyse a representative work pattern that
   is hopefully also repetitive

After all, FDO is mostly about making a size/speed tradeoff
and rearranging code (mostly branches) to avoid branch
mispredictions by the CPU.

Servus,
      Daniel





