Re: [Qemu-devel] [patch] performance improvement (softmmu, x86, GCC 3)

From: Piotr Krysik
Subject: Re: [Qemu-devel] [patch] performance improvement (softmmu, x86, GCC 3)
Date: Wed, 4 Aug 2004 05:50:18 -0700 (PDT)

The "ecx thing" and disabling GCSE are not mutually
exclusive, but I didn't try to run or benchmark QEMU with
both. I'm not a GCC guru, but I believe that combining them
should not significantly impact QEMU performance. If you
are willing to do some tests, I could send you the "ecx"
patch.
And yes, I tried different combinations of the individual
GCSE suboptions instead of plain -fno-gcse, but none worked.
To get more information about the problem, I used the
compiler's -da flag to trace GCC's optimizations of
op_rolb_kernel_T0_T1_cc. I discovered that the GCSE step
introduces a transformation that the later passes cannot
optimize away: GCC insists on using a copy of the T0 value
instead of the ebx register that is globally reserved for
T0 (and, as there are no free registers left, it gives an
error). The strangest thing I noticed is that if I inline
the stXXXX function by hand instead of using the inline
directive, the problem disappears.
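
As a rough illustration of the inline-versus-by-hand observation above, here
is a minimal C sketch. It is not the QEMU source and it will not reproduce the
GCC 3.3/3.4 error by itself. The names guest_ram, st_byte, rol8,
op_rolb_helper, op_rolb_by_hand and run_rolb are hypothetical, and the
SKETCH_PIN_T0 macro is an assumption that pins T0 to ebx only when it is
defined on a 32-bit x86 GCC build. The two op variants compute the same
result and differ only in whether the store goes through an inline function.

```c
#include <stdint.h>

/* Assumption: opt-in pinning of T0 to ebx, as the post describes for QEMU.
   Without SKETCH_PIN_T0 (hypothetical macro) T0 is an ordinary global, so
   the sketch builds on any host. */
#if defined(SKETCH_PIN_T0) && defined(__GNUC__) && defined(__i386__)
register uint32_t T0 __asm__("ebx");
#else
static uint32_t T0;
#endif
static uint32_t T1;

/* Hypothetical stand-in for guest memory. */
static uint8_t guest_ram[16];

/* Hypothetical stand-in for one of the stXXXX store helpers. */
static inline void st_byte(uint32_t addr, uint8_t val)
{
    guest_ram[addr % sizeof guest_ram] = val;
}

/* Rotate an 8-bit value left; the count is taken modulo 8. */
static uint8_t rol8(uint8_t v, unsigned count)
{
    count &= 7;
    return (uint8_t)((v << count) | (v >> (8 - count)));
}

/* Variant 1: the store goes through the inline helper. */
static void op_rolb_helper(uint32_t addr)
{
    T0 = rol8((uint8_t)T0, T1);
    st_byte(addr, (uint8_t)T0);
}

/* Variant 2: the same store, inlined by hand. */
static void op_rolb_by_hand(uint32_t addr)
{
    T0 = rol8((uint8_t)T0, T1);
    guest_ram[addr % sizeof guest_ram] = (uint8_t)T0;
}

/* Load T0/T1, run one variant, and return the byte that was stored. */
uint8_t run_rolb(uint8_t value, unsigned count, int by_hand)
{
    T0 = value;
    T1 = count;
    if (by_hand)
        op_rolb_by_hand(0);
    else
        op_rolb_helper(0);
    return guest_ram[0];
}
```

Calling run_rolb with by_hand set to 0 or 1 should give the same byte; in the
real code the interesting difference is in what the compiler does with T0
between the rotate and the store, which can be inspected in the -da dumps.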


André Braga <address@hidden> wrote:
Awesome ;)

I haven't dug into the code, so could you please tell me if the ecx
thing you mentioned at the bottom of your message and disabling GCSE
are mutually exclusive? Have you tried to narrow the problem down to
one or more of the separate GCSE flags, instead of the broader
-f[no-]gcse one?

"A year spent in artificial intelligence is enough to make one believe in God"
Alan J. Perlis

On Wed, 28 Jul 2004 07:24:42 -0700 (PDT), Piotr Krysik wrote:
> Hi!
> I'm attaching a small patch to enable the assembly
> implementation of ld, lds and st (from
> softmmu_header.h) for GCC 3.3 and GCC 3.4 when
> running a softmmu x86 guest on an x86 host.
> With my simple benchmark (dd if=/dev/zero bs=1M
> count=16 | gzip -9 on a Linux guest) this patch
> improves performance by about 8% (QEMU compiled
> with GCC 3.3 on a Pentium II Debian host).
> Regards,
> Piotrek
> PS. I also considered removing "%ecx" from the register
> constraints of st (softmmu_header.h, line 224) and
> explicitly saving ecx before calling __st (line 198),
> but the performance gain was much smaller. I suspect that
> the gcse optimization and asm blocks under GCC 3.3 and
> GCC 3.4 don't mix well in QEMU.
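
For readers unfamiliar with what listing ecx in an asm block's constraints
means, here is a minimal C sketch. It is not the code from softmmu_header.h:
the function shift_left_via_ecx is hypothetical, and the non-x86 fallback
branch is an assumption added so the sketch builds anywhere. The asm
statement overwrites ecx, so it names "ecx" in its clobber list; GCC then
keeps live values and the asm operands out of that register. The alternative
mentioned in the PS drops the clobber and has the asm code save and restore
ecx itself.

```c
#include <stdint.h>

/* Shift value left by count (count is taken modulo 32), using ecx as a
   scratch register inside the asm block. Hypothetical example function. */
uint32_t shift_left_via_ecx(uint32_t value, uint32_t count)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    uint32_t result = value;
    __asm__ ("movl %1, %%ecx\n\t"   /* the asm code overwrites ecx ...      */
             "shll %%cl, %0"        /* ... and shifts result by cl          */
             : "+r"(result)         /* read-write operand in any register   */
             : "r"(count)           /* input operand in any register        */
             : "ecx", "cc");        /* tell GCC that ecx and flags change   */
    return result;
#else
    /* Assumption: portable fallback with the same masking as x86 shll. */
    return value << (count & 31);
#endif
}
```

With "ecx" in the clobber list the compiler has one register fewer to work
with around the asm block, which is the kind of register pressure the thread
is discussing on a host with as few registers as 32-bit x86.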
