Re: [Qemu-devel] profiling qemu
From: Artyom Tarasenko
Subject: Re: [Qemu-devel] profiling qemu
Date: Tue, 14 Feb 2012 16:01:24 +0100
2012/2/14 Lluís Vilanova <address@hidden>:
> Artyom Tarasenko writes:
> [...]
>> QEMU 1.0.50 monitor - type 'help' for more information
>> (qemu) profile
>> unknown command: 'profile'
>> (qemu) info profile
>> async time 38505498320 (38.505)
>> qemu time 35947093161 (35.947)
>
>> Is there a way to find out more?
>
> Command "info jit" also has some information added when compiled with
> profiling support.
>
> Search for CONFIG_PROFILER to see which code is activated during profiling.
>
>
>> Next I tried gprof:
>
>> build-prof $ gprof sparc64-softmmu/qemu-system-sparc64 gmon.out
>> Flat profile:
>
>> Each sample counts as 0.01 seconds.
>> % cumulative self self total
>> time seconds seconds calls Ts/call Ts/call name
>> 100.00 5.06 5.06 main
>
>> Hmm. Not very informative. Is there a way to find out more details?
>
> Did you run QEMU for a reasonable amount of time? gprof uses sampling to
> capture its execution time statistics, so a small execution of QEMU will
> not be able to capture any meaningful information.
I did run it up to the OpenBIOS prompt. But I think it's my setup that
makes gprof useless on the machine where I tested git master:
the "host" is itself a virtual machine running under VirtualBox, and
it has problems with the system timer. I will re-check on a bare-metal
host.
> [...]
>> Here it looks like "compute_all_sub" and "compute_all_sub_xcc" are
>> good candidates for optimizing: together they take the same amount of
>> time as cpu_sparc_exec. I guess both operations would be trivial in
>> the x86_64 assembler. What would be the best strategy to make TCG take
>> advantage of running on an x86_64 host?
>
> A quick look into the code reveals that these two are called from a TCG helper
> (helper_compute_psr), so I see two approaches here applicable to the most
> frequently used "sub-operations" in helper_compute_psr:
>
> * Define new simpler helpers for those sub-operations that can be declared
>   with TCG_CALL_CONST and generate the new psr/xcc values in temporal
>   registers. You must make sure any other code will still be able to use
>   the new psr/xcc values.
I don't see how to make get_C_sub_xcc even simpler: all it does is the
src1 < src2 check.
> * Reimplement these sub-operations in pure TCG code.
Are there already examples where we compute flags in pure TCG code?
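
For reference, the "src1 < src2" check mentioned above, together with the
other condition codes of a subtraction, can be sketched in plain C roughly as
follows. This is only an illustration of the arithmetic: the FLAG_* bit
values and the function names are made up for this sketch and are not
QEMU's actual definitions or helpers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag bits for this sketch; not QEMU's real PSR/CCR layout. */
enum { FLAG_C = 1, FLAG_V = 2, FLAG_Z = 4, FLAG_N = 8 };

/* Carry (borrow) of a 64-bit subtraction: set exactly when src1 < src2,
 * comparing the operands as unsigned values. */
static unsigned sub_carry64(uint64_t src1, uint64_t src2)
{
    return src1 < src2;
}

/* The same check restricted to the low 32 bits of the operands. */
static unsigned sub_carry32(uint64_t src1, uint64_t src2)
{
    return (uint32_t)src1 < (uint32_t)src2;
}

/* All four flags for the 64-bit result of src1 - src2. */
static unsigned sub_flags64(uint64_t src1, uint64_t src2)
{
    uint64_t dst = src1 - src2;
    unsigned flags = 0;

    if (sub_carry64(src1, src2)) {
        flags |= FLAG_C;
    }
    /* Signed overflow: the operands have different signs and the sign of
     * the result differs from the sign of src1. */
    if (((src1 ^ src2) & (src1 ^ dst)) >> 63) {
        flags |= FLAG_V;
    }
    if (dst == 0) {
        flags |= FLAG_Z;
    }
    if (dst >> 63) {
        flags |= FLAG_N;
    }
    return flags;
}

/* Small self-check of the sketch. */
static void sub_flags_selftest(void)
{
    assert(sub_flags64(5, 5) == FLAG_Z);
    assert(sub_flags64(3, 5) == (FLAG_C | FLAG_N));
}
```

Each flag here is a comparison or a couple of bitwise operations, so it seems
plausible that the same thing could be expressed with ordinary TCG ops
instead of a helper call, but I have not verified that.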
> But first, make sure you run a proper benchmark to establish where the
> hotspots are in the sparc code for QEMU. The problem here is to establish
> what a proper benchmark is :)
>
:)
Artyom
--
Regards,
Artyom Tarasenko
solaris/sparc under qemu blog: http://tyom.blogspot.com/search/label/qemu