qemu-devel

Re: [Qemu-devel] [RFC] alpha qemu arithmetic exceptions


From: Al Viro
Subject: Re: [Qemu-devel] [RFC] alpha qemu arithmetic exceptions
Date: Tue, 8 Jul 2014 18:20:02 +0100
User-agent: Mutt/1.5.21 (2010-09-15)

On Tue, Jul 08, 2014 at 05:33:16PM +0100, Peter Maydell wrote:

> > Incidentally, combination of --enable-gprof and (default) --enable-pie
> > won't build - it dies with ld(1) complaining about relocs in gcrt1.o.
> 
> This sounds like a toolchain bug to me :-)

Debian stable/amd64, gcc 4.7.2, binutils 2.22.  And google search finds
this, for example: http://osdir.com/ml/qemu-devel/2013-05/msg00710.html.
That one has gcc 4.4.3.

Anyway, adding --disable-pie to --enable-gprof gets it to build, but
as I said, gprof is no better than perf and oprofile - same problem.

Stats I quoted were from qemu-system-alpha booting debian/lenny (5.10) and
going through their kernel package build.  I have perf report in front of
me right now; the top ones are
 41.77%  qemu-system-alp  perf-24701.map           [.] 0x7fbbee558930
 11.78%  qemu-system-alp  qemu-system-alpha        [.] cpu_alpha_exec
  4.95%  qemu-system-alp  [vdso]                   [.] 0x7fffdd7ff8de
  2.40%  qemu-system-alp  qemu-system-alpha        [.] phys_page_find
  1.49%  qemu-system-alp  qemu-system-alpha        [.] address_space_translate_internal
  1.34%  qemu-system-alp  [kernel.kallsyms]        [k] read_hpet
  1.26%  qemu-system-alp  qemu-system-alpha        [.] tlb_set_page
  1.23%  qemu-system-alp  qemu-system-alpha        [.] find_next_bit
  1.04%  qemu-system-alp  qemu-system-alpha        [.] get_page_addr_code
  1.01%  qemu-system-alp  libpthread-2.13.so       [.] pthread_mutex_lock
  0.88%  qemu-system-alp  qemu-system-alpha        [.] helper_cmpbge
  0.80%  qemu-system-alp  libc-2.13.so             [.] __memset_sse2
  0.72%  qemu-system-alp  libpthread-2.13.so       [.] __pthread_mutex_unlock_usercnt
  0.70%  qemu-system-alp  qemu-system-alpha        [.] get_physical_address
  0.69%  qemu-system-alp  qemu-system-alpha        [.] address_space_translate
  0.68%  qemu-system-alp  qemu-system-alpha        [.] tcg_optimize
  0.67%  qemu-system-alp  qemu-system-alpha        [.] ldq_phys
  0.63%  qemu-system-alp  qemu-system-alpha        [.] qemu_get_ram_ptr
  0.62%  qemu-system-alp  qemu-system-alpha        [.] helper_le_ldq_mmu
  0.57%  qemu-system-alp  qemu-system-alpha        [.] memory_region_is_ram

and cpu_alpha_exec() spends most of the time in inlined tb_find_fast().
It might be worth checking the actual distribution of the hash of virt
address used by that sucker - I wonder if dividing its argument by 4
(alpha insns are 4-byte aligned) wouldn't improve things, but I don't
have stats on actual frequency
of conflicts, etc.  In any case, the first lump (42%) seems to be tastier ;-)
There are all kinds of microoptimizations possible (e.g. helper_cmpbge() could
be done by a couple of MMX insns on amd64 host[1]), but it would be nice to
have some details on what we spend the time on in tcg output...
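
To make the hash question above concrete, here is a rough sketch of how
one could count occupied slots with and without the divide-by-4.  The
hash below (toy_jmp_hash) and all of its constants are assumptions that
only imitate the "fold page bits and offset bits" style of such a lookup;
it is not copied from the tree, so substitute the real function before
drawing conclusions.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* All constants are assumptions for illustration, not taken from QEMU. */
#define PAGE_BITS      13                      /* 8K pages, as on alpha */
#define CACHE_BITS     12
#define CACHE_SIZE     (1u << CACHE_BITS)
#define JMP_PAGE_BITS  (CACHE_BITS / 2)
#define JMP_PAGE_SIZE  (1u << JMP_PAGE_BITS)
#define JMP_ADDR_MASK  (JMP_PAGE_SIZE - 1)
#define JMP_PAGE_MASK  (CACHE_SIZE - JMP_PAGE_SIZE)

/* Hypothetical hash: mixes page-number bits with in-page offset bits. */
static unsigned toy_jmp_hash(uint64_t pc)
{
    uint64_t tmp = pc ^ (pc >> (PAGE_BITS - JMP_PAGE_BITS));

    return (unsigned)(((tmp >> (PAGE_BITS - JMP_PAGE_BITS)) & JMP_PAGE_MASK)
                      | (tmp & JMP_ADDR_MASK));
}

/*
 * Count how many distinct slots the given addresses land in when each
 * address is shifted right by 'shift' bits before hashing
 * (shift == 2 is the "divide by 4" variant).
 */
static unsigned slots_used(const uint64_t *pcs, size_t n, int shift)
{
    static unsigned char seen[CACHE_SIZE];
    unsigned used = 0;

    memset(seen, 0, sizeof(seen));
    for (size_t i = 0; i < n; i++) {
        unsigned h = toy_jmp_hash(pcs[i] >> shift);

        if (!seen[h]) {
            seen[h] = 1;
            used++;
        }
    }
    return used;
}
```

Feeding it 64 consecutive 4-byte-aligned addresses starting at 0, this toy
hash fills 32 slots unshifted and 64 with shift == 2.  That input is
chosen to show the effect and says nothing about real workloads; the
interesting run would be over addresses logged from an actual guest.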

[1] The reason why helper_cmpbge() shows up is that string functions on alpha
use that insn a lot; it _might_ be worth optimizing.
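
For illustration, a sketch of what such a helper could look like.  CMPBGE
sets bit i of the result when byte i of the first operand is >= byte i of
the second, compared unsigned.  The version below uses SSE2 intrinsics
rather than MMX proper, since SSE2 is always there on amd64; the names
cmpbge_ref and cmpbge_simd are made up for this example and are not the
helper in the tree.  Non-x86-64 hosts fall back to the byte loop.

```c
#include <stdint.h>
#if defined(__SSE2__) && defined(__x86_64__)
#include <emmintrin.h>
#endif

/* Reference semantics: one result bit per byte, unsigned a >= b. */
static uint64_t cmpbge_ref(uint64_t a, uint64_t b)
{
    uint64_t res = 0;

    for (int i = 0; i < 8; i++) {
        uint8_t ba = (uint8_t)(a >> (i * 8));
        uint8_t bb = (uint8_t)(b >> (i * 8));

        if (ba >= bb) {
            res |= 1ull << i;
        }
    }
    return res;
}

/* Hypothetical vectorized variant; falls back to the loop elsewhere. */
static uint64_t cmpbge_simd(uint64_t a, uint64_t b)
{
#if defined(__SSE2__) && defined(__x86_64__)
    __m128i va = _mm_cvtsi64_si128((long long)a);
    __m128i vb = _mm_cvtsi64_si128((long long)b);
    /* max(a, b) == a holds exactly for the bytes where a >= b (unsigned) */
    __m128i ge = _mm_cmpeq_epi8(_mm_max_epu8(va, vb), va);

    /* upper 8 lanes are 0 >= 0 and always set, so keep the low 8 bits */
    return (uint64_t)(_mm_movemask_epi8(ge) & 0xff);
#else
    return cmpbge_ref(a, b);
#endif
}
```

The string-function idiom is cmpbge with zero as the first operand, which
flags the zero bytes of the second: for 0x6f6c6c6568 ("hello" in the low
five bytes) both versions return 0xe0.  Whether this beats the existing
helper once call overhead is counted would have to be measured.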


