qemu-devel



From: Matwey V. Kornilov
Subject: Re: [Qemu-devel] qemu-user-linux: how could I measure performance for aarch64 and arm?
Date: Sun, 13 Jan 2019 10:31:13 +0300

Fri, 11 Jan 2019 at 22:24, Matwey V. Kornilov <address@hidden> wrote:
>
> Fri, 11 Jan 2019 at 12:52, Peter Maydell <address@hidden> wrote:
> >
> > On Thu, 10 Jan 2019 at 19:33, Matwey V. Kornilov
> > <address@hidden> wrote:
> > > I am running the same application compiled for aarch64 and armv7l on
> > > x86_64 platform using qemu-user-linux tools.
> > >
> > > I see dramatic performance difference (30 times) between emulated
> > > architectures: aarch64 runs for ~4 minutes, armv7l runs for ~2 hours.
> > > I do understand that CPU architecture emulation is an inherently slow
> > > thing, but my question is about the difference.
> > >
> > > How could I debug this to understand the reason for such a big
> > > difference? I've already tried running stress-ng compiled for these two
> > > architectures, but it gives the same per-second performance on both.
> > >
> > > I am running qemu 2.11; should I try another version?
> >
> > Yes, do try 3.1 -- we have done some overall TCG performance
> > improvements.
>
> Indeed, qemu-arm from master runs for 4 minutes where 2.11 runs for 2
> hours for me. It is an impressive improvement.

I've managed to bisect down to the first good (fast) commit:

commit 2a53535af471f4bee9d6cb5b363746b8d5ed21dd
Author: Luke Shumaker <address@hidden>
Date:   Thu Dec 28 13:08:13 2017 -0500

    linux-user: init_guest_space: Try to make ARM space+commpage continuous

Though I am not sure how it helps.
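
For anyone who wants to repeat a bisection like this, a timing check can be
automated with `git bisect run`. The following is only a minimal sketch, not
the procedure actually used here: the script name (bisect_speed.py), the
threshold, and the command being timed are all assumptions. It relies on the
documented `git bisect run` convention that exit code 0 marks a revision as
old, 1 marks it as new, and 125 skips it.

```python
#!/usr/bin/env python3
"""Timing predicate for `git bisect run` (illustrative sketch)."""
import subprocess
import sys
import time

SKIP = 125  # git bisect run treats 125 as "this revision cannot be tested"


def time_command(cmd, timeout=None):
    """Run cmd; return wall-clock seconds, or None if it exceeded timeout.

    Raises subprocess.CalledProcessError if cmd exits non-zero.
    """
    start = time.monotonic()
    try:
        subprocess.run(cmd, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL,
                       timeout=timeout, check=True)
    except subprocess.TimeoutExpired:
        return None
    return time.monotonic() - start


def bisect_exit_code(elapsed, threshold):
    """Return 0 to mark the revision old ("slow"), 1 to mark it new ("fast")."""
    if elapsed is None or elapsed >= threshold:
        return 0
    return 1


def main(argv):
    """argv = [threshold_seconds, command, args...]."""
    threshold = float(argv[0])
    try:
        # Stop waiting once the run is already known to be slow.
        elapsed = time_command(argv[1:], timeout=threshold)
    except (OSError, subprocess.CalledProcessError):
        return SKIP  # missing binary or failed guest run: do not classify
    return bisect_exit_code(elapsed, threshold)


if __name__ == "__main__" and len(sys.argv) > 2:
    sys.exit(main(sys.argv[1:]))
```

Because the goal is the first fast commit rather than the first broken one,
custom terms keep the meaning straight: `git bisect start --term-old=slow
--term-new=fast`, then `git bisect slow <2.11 revision>` and `git bisect fast
master`, then `git bisect run` with a wrapper that rebuilds QEMU (exiting 125
if the build fails) and calls the script with something like
`600 <path to qemu-arm> <guest binary>`. The 600-second threshold is just a
hypothetical value between the 4-minute and 2-hour runs.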

>
> >
> > For a big difference between target architectures like that,
> > I would try starting by using some host performance tools on
> > the two runs to see where all the time is being taken in
> > the armv7l guest run -- is it all in translated guest code,
> > or is there more time (proportionally) spent in particular
> > parts of the QEMU C code? Does the armv7l version do
> > many more or different syscalls (check with the QEMU -strace
> > option) ?
> >
> > Also you should check performance on h/w 32 bit vs
> > 64-bit Arm if you can, to confirm that it's not just
> > that the guest application runs much slower there.
> > (If you don't have the arm hardware you could at least
> > check x86 32-bit vs 64-bit.)
> >
> > thanks
> > -- PMM
>
>
>
> --
> With best regards,
> Matwey V. Kornilov



-- 
With best regards,
Matwey V. Kornilov


