
Re: Slowness with multi-thread TCG?

From: Frederic Barrat
Subject: Re: Slowness with multi-thread TCG?
Date: Wed, 29 Jun 2022 17:36:44 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.10.0

On 29/06/2022 00:17, Alex Bennée wrote:
> If you run the sync-profiler (via the HMP "sync-profile on") you can
> then get a breakdown of which mutex's are being held and for how long
> ("info sync-profile").

Alex, a huge thank you!

For the record, the "info sync-profile" showed:
Type       Object          Call site                     Wait Time (s)  Count     Average (us)
BQL mutex  0x55eb89425540  accel/tcg/cpu-exec.c:744      96.31578       73589937  1.31
BQL mutex  0x55eb89425540  target/ppc/helper_regs.c:207  0.00150        1178      1.27

And it points to a lock in the interrupt delivery path, in cpu_handle_interrupt().
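As a sanity check on those numbers, here is a small Python sketch that recomputes the "Average (us)" column from the wait time and count, and shows how much of the total wait belongs to each call site. The row values are copied from the output above; the helper names (average_us, share_of_total) are made up for illustration and are not part of QEMU.

```python
# Rows copied from the "info sync-profile" output quoted above:
# (call site, total wait time in seconds, acquisition count, reported average in us)
ROWS = [
    ("accel/tcg/cpu-exec.c:744", 96.31578, 73589937, 1.31),
    ("target/ppc/helper_regs.c:207", 0.00150, 1178, 1.27),
]


def average_us(wait_s, count):
    """Average wait per acquisition, in microseconds."""
    return wait_s / count * 1e6


def share_of_total(rows):
    """Fraction of the summed wait time attributed to each call site."""
    total = sum(wait_s for _, wait_s, _, _ in rows)
    return {site: wait_s / total for site, wait_s, _, _ in rows}


if __name__ == "__main__":
    for site, wait_s, count, reported in ROWS:
        print(f"{site}: computed {average_us(wait_s, count):.2f} us, reported {reported} us")
    for site, share in share_of_total(ROWS).items():
        print(f"{site}: {share:.4%} of total wait")
```

Each individual wait is short (about a microsecond); what hurts is the sheer number of acquisitions at cpu-exec.c:744.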

I now understand the root cause. The signal for the decrementer interrupt remains set because, per the config, the interrupt is not being delivered. I'm not quite sure what the proper fix is yet (there seem to be several implementations of the decrementer on ppc), but at least I understand why we are so slow.
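To illustrate why a signal that stays set is so expensive, here is a toy model in Python. It is only my reading of the behaviour, not QEMU code: ToyCPU, run_blocks and the fields on them are hypothetical names, and the real logic in cpu_handle_interrupt() is more involved. The assumption is that the execution loop takes the big lock whenever an interrupt request is pending, and the request is only cleared once the interrupt is actually delivered.

```python
import threading


class ToyCPU:
    """Hypothetical stand-in for a vCPU; not QEMU's CPUState."""

    def __init__(self, interrupt_pending, delivery_enabled):
        self.interrupt_pending = interrupt_pending
        self.delivery_enabled = delivery_enabled


def run_blocks(cpu, big_lock, n_blocks):
    """Run n_blocks iterations of a toy execution loop.

    Returns how many times the big lock was taken.
    """
    acquisitions = 0
    for _ in range(n_blocks):
        if cpu.interrupt_pending:
            # Assumed behaviour: the pending check goes through the big lock.
            with big_lock:
                acquisitions += 1
                if cpu.delivery_enabled:
                    # Delivering the interrupt clears the request.
                    cpu.interrupt_pending = False
                # Otherwise the request stays set and we come back
                # here on the next iteration.
        # ... the translated block would execute here ...
    return acquisitions


if __name__ == "__main__":
    lock = threading.Lock()
    masked = ToyCPU(interrupt_pending=True, delivery_enabled=False)
    enabled = ToyCPU(interrupt_pending=True, delivery_enabled=True)
    print("masked: ", run_blocks(masked, lock, 100000))
    print("enabled:", run_blocks(enabled, lock, 100000))
```

In this model the masked case takes the lock on every iteration, while the enabled case takes it once. With several cpus doing this against a single shared lock, they would also contend with each other, which would fit with more cpus making the slowdown more visible.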

With a quick hack, I could verify that by moving that signal out of the way, the decompression time of the kernel is now peanuts, no matter the number of cpus. Even with one cpu, the 15 seconds measured before was already a huge waste, so it was not really a multiple-cpus problem. Multiple cpus were just highlighting it.

Thanks again!

