Re: GICv3 for MTTCG
From: Andrey Shinkevich
Subject: Re: GICv3 for MTTCG
Date: Thu, 13 May 2021 16:35:43 +0000
Dear colleagues,
Thank you all very much for your responses. Let me reply to all of them in one message.
I configured QEMU for an AArch64 guest:
$ ./configure --target-list=aarch64-softmmu
When I start QEMU with GICv3 on an x86 host:
qemu-system-aarch64 -machine virt-6.0,accel=tcg,gic-version=3
QEMU reports this error from hw/pci/msix.c:
error_setg(errp, "MSI-X is not supported by interrupt controller");
The variable 'msi_nonbroken' is probably supposed to be set in
hw/intc/arm_gicv3_its_common.c:
gicv3_its_init_mmio(..)
I guess that this path is taken with KVM acceleration only, not with TCG.
The error persists after applying the series:
https://lists.gnu.org/archive/html/qemu-arm/2021-04/msg00944.html
"GICv3 LPI and ITS feature implementation"
(special thanks for referring me to that)
Could you please clarify this and advise how that error can be fixed?
Does MSI-X support need to be implemented separately for GICv3?
Once that works, I would like to test QEMU with the maximum number of cores
to get the best MTTCG performance.
We will probably get only a modest percentage of performance improvement
with the BQL series applied, won't we? I will test it as well.
Best regards,
Andrey Shinkevich
On 5/12/21 6:43 PM, Alex Bennée wrote:
>
> Andrey Shinkevich <andrey.shinkevich@huawei.com> writes:
>
>> Dear colleagues,
>>
>> I am looking for ways to accelerate MTTCG for an ARM guest on an x86-64 host.
>> The maximum number of CPUs for MTTCG that uses GICv2 is limited to 8:
>>
>> include/hw/intc/arm_gic_common.h:#define GIC_NCPU 8
>>
>> Version 3 of the Generic Interrupt Controller (GICv3) is not
>> supported in QEMU for some reason unknown to me. It would allow
>> raising the CPU limit and improving MTTCG performance on a
>> multi-core hypervisor.
>
> It is supported, you just need to select it.
>
>> I have an idea to implement the Interrupt Translation Service (ITS)
>> for use by MTTCG on the ARM architecture.
>
> There is some work to support ITS under TCG already posted:
>
> Subject: [PATCH v3 0/8] GICv3 LPI and ITS feature implementation
> Date: Thu, 29 Apr 2021 19:41:53 -0400
> Message-Id: <20210429234201.125565-1-shashi.mallela@linaro.org>
>
> please do review and test.
>
>> Do you find that idea useful and feasible?
>> If yes, how much time do you estimate such a project would take
>> one developer to complete?
>> If no, what are the reasons for not implementing GICv3 for MTTCG in QEMU?
>
> As far as MTTCG performance is concerned there is a degree of
> diminishing returns to be expected as the synchronisation cost between
> threads will eventually outweigh the gains of additional threads.
>
> There are a number of parts that could improve this performance. The
> first would be picking up the BQL reduction series from your FutureWei
> colleagues who worked on the problem when they were Linaro assignees:
>
> Subject: [PATCH v2 0/7] accel/tcg: remove implied BQL from
> cpu_handle_interrupt/exception path
> Date: Wed, 19 Aug 2020 14:28:49 -0400
> Message-Id: <20200819182856.4893-1-robert.foley@linaro.org>
>
> There was also a longer series moving towards per-CPU locks:
>
> Subject: [PATCH v10 00/73] per-CPU locks
> Date: Wed, 17 Jun 2020 17:01:18 -0400
> Message-Id: <20200617210231.4393-1-robert.foley@linaro.org>
>
> I believe the initial measurements showed that the BQL cost started to
> edge up with GIC interactions. We did discuss approaches for this and I
> think one idea was to use non-BQL locking for the GIC. You would need to
> revert:
>
> Subject: [PATCH-for-5.2] exec: Remove MemoryRegion::global_locking field
> Date: Thu, 6 Aug 2020 17:07:26 +0200
> Message-Id: <20200806150726.962-1-philmd@redhat.com>
>
> and then implement more fine-grained locking in the GIC emulation
> itself. However I think the BQL and per-CPU locks are lower hanging
> fruit to tackle first.
>
>>
>> Best regards,
>> Andrey Shinkevich
>
>