Re: [Qemu-devel] Potential to accelerate QEMU for specific architectures

From: Lior Vernia
Subject: Re: [Qemu-devel] Potential to accelerate QEMU for specific architectures
Date: Sun, 26 May 2013 08:40:20 +0300


On Sat, May 25, 2013 at 10:06 PM, Andreas Färber <address@hidden> wrote:
> Hi,
> On 24.05.2013 21:24, Lior Vernia wrote:
>> I am running x86 applications on an ARM device using QEMU, and found
>> it too slow for my needs.
> Before we start going into technical details, what are you trying to
> achieve on a high level and how did you try to do it?
> Are you using qemu-system-x86_64 or qemu-x86_64? The latest v1.5.0?

Sorry, right after I wrote the message it occurred to me that I should
have mentioned that I was talking about qemu-system, either x86 or i386.
So far I have only run the Limbo app on a Galaxy SIII with various
images, just to see what it is capable of, and was disappointed. Limbo
seems to run QEMU v1.1.0.

If you suspect that it's the JNI wrapping that's causing a lot of the
damage, then we can talk about compiling QEMU for ARM and running it
natively; I just haven't been able to get that to work.

>> This is to be expected, of course, this is
>> not a complaint.
> Especially since most people still run on x86 ...
>> However, I was wondering whether this could be helped
>> by "overriding" the generic binary translation mechanism and focusing
>> on lower level binary translation just from x86 to ARM.
>> It's clear to me that this isn't a small project, but it might be
>> important enough for me to invest myself in. However, before I jump
>> into it, I wanted to inquire whether this would be worthwhile at all.
>> Does anyone have any estimate as to how big of a gain that could
>> achieve? Or whether a more significant improvement could be achieved
>> by further tweaking that didn't occur to me?

I wanted to add that I've been reading about this Russian startup
that's looking to emulate x86 on ARM at 40% of native speed using
dynamic binary translation (as far as I gather):
So this should be possible. And it can't be very much unlike QEMU, can it?

> ... the tcg/arm/ code does not get a lot of love, so you might be able
> to squeeze some more performance out of it by implementing optional TCG
> ops or optimizing existing implementations. In theory most TCG ops
> should correspond to a machine instruction (where available); there's a
> TCG-level optimizer to create more efficient code, but it's a tradeoff
> between time for code optimization and execution time.
> Needless to say that you should enable -O3 optimization (or something)
> for the core C code and not to enable debug features in configure for
> your performance measurements. :)
> Whatever implementation you experiment with, get familiar with our
> Git-based workflow and try to stay close to qemu.git code or otherwise
> you'll create a fork with little chance of getting integrated into the
> code base - meaning both we don't get your speedups and you don't get
> our latest features and bugfixes. One such example was the attempt to
> use LLVM instead of TCG.

Thanks, but we're getting slightly ahead of ourselves here :) I'd
still want to make sure that QEMU is at fault for the performance and,
if that's the case, that there's potential for real improvement before
I start getting my hands dirty.
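
To make the tradeoff mentioned above (time spent optimizing translated
code versus time spent executing it) a bit more concrete, here is a
minimal sketch of what an IR-level optimizer pass can look like. This is
a toy model and NOT QEMU's TCG: the tuple-based IR, the op names
("movi", "add", "mul") and the functions fold_constants, drop_dead,
optimize and run are all hypothetical names made up for illustration. It
only handles straight-line blocks.

```python
# Toy IR, assumed for illustration only: each op is (opcode, dest, a, b).
# Operands are temp/register names (str) or integer constants.

def fold_constants(ops):
    """Turn ops whose inputs are all known constants into a single 'movi'."""
    known = {}  # temp name -> constant value known at this point
    out = []
    for opc, dst, a, b in ops:
        # Substitute operands that currently hold a known constant.
        a = known.get(a, a) if isinstance(a, str) else a
        b = known.get(b, b) if isinstance(b, str) else b
        if opc == "movi":
            known[dst] = a
            out.append(("movi", dst, a, None))
        elif opc in ("add", "mul") and isinstance(a, int) and isinstance(b, int):
            val = a + b if opc == "add" else a * b
            known[dst] = val
            out.append(("movi", dst, val, None))
        else:
            known.pop(dst, None)  # dst no longer holds a known constant
            out.append((opc, dst, a, b))
    return out

def drop_dead(ops, live_out):
    """Remove ops whose result is never read (backward liveness scan)."""
    live = set(live_out)
    kept = []
    for opc, dst, a, b in reversed(ops):
        if dst in live:
            live.discard(dst)
            live.update(x for x in (a, b) if isinstance(x, str))
            kept.append((opc, dst, a, b))
    kept.reverse()
    return kept

def optimize(ops, live_out):
    """Run both passes; each pass costs translation time per block."""
    return drop_dead(fold_constants(ops), live_out)

def run(ops, env):
    """Reference interpreter, used to check the passes preserve semantics."""
    regs = dict(env)
    val = lambda x: regs[x] if isinstance(x, str) else x
    for opc, dst, a, b in ops:
        if opc == "movi":
            regs[dst] = a
        elif opc == "add":
            regs[dst] = val(a) + val(b)
        elif opc == "mul":
            regs[dst] = val(a) * val(b)
        else:
            raise ValueError("unknown op: " + opc)
    return regs

# Hypothetical translated block: eax += 4 * 8
GUEST_BLOCK = [
    ("movi", "t0", 4, None),
    ("movi", "t1", 8, None),
    ("mul", "t2", "t0", "t1"),
    ("add", "eax", "eax", "t2"),
]

print(optimize(GUEST_BLOCK, {"eax"}))
```

Here four ops collapse into a single add with an immediate operand, so
the backend would have less code to emit. The catch is that both passes
run at translation time for every block, which only pays off for blocks
that are executed often enough.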
