qemu-devel

Re: [Qemu-devel] profiling qemu


From: Lluís Vilanova
Subject: Re: [Qemu-devel] profiling qemu
Date: Tue, 14 Feb 2012 17:57:22 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.0.93 (gnu/linux)

Artyom Tarasenko writes:
>> [...]
>>> Here it looks like "compute_all_sub" and "compute_all_sub_xcc" are
>>> good candidates for optimizing: together they take the same amount of
>>> time as cpu_sparc_exec. I guess both operations would be trivial in
>>> the x86_64 assembler. What would be the best strategy to make TCG take
>>> the advantage of running on a x86_64 host?
>> 
>> A quick look into the code reveals that these two are called from a TCG
>> helper (helper_compute_psr), so I see two approaches here applicable to the
>> most frequently used "sub-operations" in helper_compute_psr:
>> 
>> * Define new simpler helpers for those sub-operations that can be declared
>>  with TCG_CALL_CONST and generate the new psr/xcc values in temporary
>>  registers. You must make sure any other code will still be able to use the
>>  new psr/xcc values.

> I don't see how to make get_C_sub_xcc even simpler: all it does is the
> src1 < src2 check.

Well, the current helpers are not declared as TCG_CALL_CONST, which means that
all temporary registers holding values derived from tcg_global_mem_new are saved
to their canonical location before calling the helper (e.g., tmp_reg is stored
into env->pcr if and only if tmp_reg contains a value that was transitively
loaded from env->pcr). When the helper returns, these temporary registers are
reloaded from the current env->pcr value, even if it was not modified (this
applies to all "users" of tcg_global_mem_new).

This can add noticeable overhead, which you can remove by declaring the new
helpers with the TCG_CALL_CONST flag and using TCG registers throughout to
perform the computations. But then you must ensure that the results are saved to
their canonical location after computing them, or that all other uses of those
values go through the proper TCG registers (i.e., don't read env->pcr in
helpers if other code is modifying TCG registers that cache that value without
storing it back into env->pcr).
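
To make the cost concrete, here is a toy model in plain C. It is not QEMU code
and uses no TCG API: ToyState, call_plain and call_const are hypothetical names
invented for this sketch, and a single field stands in for one global created
with tcg_global_mem_new. It only assumes the save-before/reload-after behaviour
described above.

```c
#include <stdint.h>

/* Toy model, NOT QEMU code: one CPU state field and the host register
 * that caches it. All names here are hypothetical. */
typedef struct {
    uint32_t field_mem;  /* canonical location (stands in for a CPU state field) */
    uint32_t field_reg;  /* host register currently caching that field */
    unsigned stores;     /* number of write-backs to the canonical location */
    unsigned loads;      /* number of reloads from the canonical location */
} ToyState;

/* A helper that depends only on its arguments and never touches ToyState. */
static uint32_t toy_carry_sub(uint32_t src1, uint32_t src2)
{
    return src1 < src2;
}

/* Call without a "const" promise: the cached global is written back before
 * the call and reloaded afterwards, even though the helper ignores it. */
static uint32_t call_plain(ToyState *s, uint32_t a, uint32_t b)
{
    uint32_t r;

    s->field_mem = s->field_reg;   /* save before the call */
    s->stores++;
    r = toy_carry_sub(a, b);
    s->field_reg = s->field_mem;   /* reload after the call */
    s->loads++;
    return r;
}

/* Call with the promise: the cached global stays in its register, so the
 * canonical location may be stale until something writes it back. */
static uint32_t call_const(ToyState *s, uint32_t a, uint32_t b)
{
    (void)s;                       /* no synchronisation at all */
    return toy_carry_sub(a, b);
}
```

With field_reg = 5 and field_mem = 0, one call_plain costs one store and one
load and leaves field_mem at 5; one call_const costs neither, but field_mem
stays at 0. That stale value is exactly why other code must not read the
canonical location directly in this scheme.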

Hope it's clearer now.


>> * Reimplement these sub-operations in pure TCG code.

> Are there already examples where we compute flags in pure TCG code?

I'm pretty sure there are some in the x86 target, but I don't think that is the
simplest example to start from.
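
As a rough idea of what "pure TCG code" would have to compute, the condition
codes of a 32-bit subtraction need only a handful of simple operations, written
here as plain C rather than TCG ops. SubFlags and sub_flags are hypothetical
names for this sketch; the formulas are the usual two's-complement ones, with
the carry being the src1 < src2 check mentioned above. Presumably each line
would map to one or two TCG ops (a subtract, an unsigned compare, xor/and/shift),
but that mapping is an assumption, not something taken from the SPARC target.

```c
#include <stdint.h>

/* Hypothetical container for the four condition codes, one bit each. */
typedef struct {
    uint32_t n;  /* negative: sign bit of the result */
    uint32_t z;  /* zero: result is 0 */
    uint32_t v;  /* signed overflow */
    uint32_t c;  /* carry (borrow) out of the subtraction */
} SubFlags;

/* Compute the flags of dst = src1 - src2 without any helper call. */
static SubFlags sub_flags(uint32_t src1, uint32_t src2)
{
    uint32_t dst = src1 - src2;
    SubFlags f;

    f.n = dst >> 31;
    f.z = (dst == 0);
    /* Overflow: the operands differ in sign and the result's sign
     * differs from src1's sign. */
    f.v = ((src1 ^ src2) & (src1 ^ dst)) >> 31;
    /* Borrow: an unsigned comparison of the two operands. */
    f.c = (src1 < src2);
    return f;
}
```

For example, sub_flags(1, 2) sets n and c only, while sub_flags(0x80000000, 1)
sets v only.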


Lluis

-- 
 "And it's much the same thing with knowledge, for whenever you learn
 something new, the whole world becomes that much richer."
 -- The Princess of Pure Reason, as told by Norton Juster in The Phantom
 Tollbooth


