[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [RFC v5 4/4] cpus: extract out accel-specific code to each accel
From: |
Claudio Fontana |
Subject: |
Re: [RFC v5 4/4] cpus: extract out accel-specific code to each accel |
Date: |
Tue, 16 Jun 2020 20:02:00 +0200 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.4.1 |
On 6/16/20 7:52 PM, Alex Bennée wrote:
>
> Claudio Fontana <cfontana@suse.de> writes:
>
>> Hi Alex,
>>
>> thanks for looking at this,
>>
>> On 6/16/20 4:16 PM, Alex Bennée wrote:
>>>
>>> Claudio Fontana <cfontana@suse.de> writes:
>>>
>>>> each accelerator registers a new "CpusAccel" interface
>>>> implementation on initialization, providing functions for
>>>> starting a vcpu, kicking a vcpu, and sychronizing state.
>>>>
>>>> This way the code in cpus.c is now all general softmmu code,
>>>> nothing accelerator-specific anymore.
>>>>
>>>> There is still some ifdeffery for WIN32 though.
>>>>
>>>> Signed-off-by: Claudio Fontana <cfontana@suse.de>
>>>> ---
>>>> MAINTAINERS | 1 +
>>>> accel/Makefile.objs | 2 +-
>>>> accel/kvm/Makefile.objs | 2 +
>>>> accel/kvm/kvm-all.c | 15 +-
>>>> accel/kvm/kvm-cpus.c | 94 +++++
>>>> accel/kvm/kvm-cpus.h | 17 +
>>>> accel/qtest/Makefile.objs | 2 +
>>>> accel/qtest/qtest-cpus.c | 105 +++++
>>>> accel/qtest/qtest-cpus.h | 17 +
>>>> accel/{ => qtest}/qtest.c | 7 +
>>>> accel/stubs/kvm-stub.c | 3 +-
>>>> accel/tcg/Makefile.objs | 1 +
>>>> accel/tcg/tcg-all.c | 12 +-
>>>> accel/tcg/tcg-cpus.c | 523 ++++++++++++++++++++++++
>>>> accel/tcg/tcg-cpus.h | 17 +
>>>> hw/core/cpu.c | 1 +
>>>> include/sysemu/cpus.h | 32 ++
>>>> include/sysemu/hw_accel.h | 57 +--
>>>> include/sysemu/kvm.h | 2 +-
>>>> softmmu/cpus.c | 911
>>>> ++++--------------------------------------
>>>> stubs/Makefile.objs | 1 +
>>>> stubs/cpu-synchronize-state.c | 15 +
>>>> target/i386/Makefile.objs | 7 +-
>>>> target/i386/hax-all.c | 6 +-
>>>> target/i386/hax-cpus.c | 85 ++++
>>>> target/i386/hax-cpus.h | 17 +
>>>> target/i386/hax-i386.h | 2 +
>>>> target/i386/hax-posix.c | 12 +
>>>> target/i386/hax-windows.c | 20 +
>>>> target/i386/hvf/Makefile.objs | 2 +-
>>>> target/i386/hvf/hvf-cpus.c | 141 +++++++
>>>> target/i386/hvf/hvf-cpus.h | 17 +
>>>> target/i386/hvf/hvf.c | 3 +
>>>> target/i386/whpx-all.c | 3 +
>>>> target/i386/whpx-cpus.c | 96 +++++
>>>> target/i386/whpx-cpus.h | 17 +
>>>> 36 files changed, 1362 insertions(+), 903 deletions(-)
>>>> create mode 100644 accel/kvm/kvm-cpus.c
>>>> create mode 100644 accel/kvm/kvm-cpus.h
>>>> create mode 100644 accel/qtest/Makefile.objs
>>>> create mode 100644 accel/qtest/qtest-cpus.c
>>>> create mode 100644 accel/qtest/qtest-cpus.h
>>>> rename accel/{ => qtest}/qtest.c (86%)
>>>> create mode 100644 accel/tcg/tcg-cpus.c
>>>> create mode 100644 accel/tcg/tcg-cpus.h
>>>> create mode 100644 stubs/cpu-synchronize-state.c
>>>> create mode 100644 target/i386/hax-cpus.c
>>>> create mode 100644 target/i386/hax-cpus.h
>>>> create mode 100644 target/i386/hvf/hvf-cpus.c
>>>> create mode 100644 target/i386/hvf/hvf-cpus.h
>>>> create mode 100644 target/i386/whpx-cpus.c
>>>> create mode 100644 target/i386/whpx-cpus.h
>>>
>>> Predictably for such a spider patch I got a bunch of conflicts
>>> attempting to merge on my testing branch so only a few comments.
>>>
>>>>
>>>> diff --git a/MAINTAINERS b/MAINTAINERS
>>>> index f308537d42..ef8cbb2680 100644
>>>> --- a/MAINTAINERS
>>>> +++ b/MAINTAINERS
>>>> @@ -427,6 +427,7 @@ WHPX CPUs
>>>> M: Sunil Muthuswamy <sunilmut@microsoft.com>
>>>> S: Supported
>>>> F: target/i386/whpx-all.c
>>>> +F: target/i386/whpx-cpus.c
>>>> F: target/i386/whp-dispatch.h
>>>> F: accel/stubs/whpx-stub.c
>>>> F: include/sysemu/whpx.h
>>>> diff --git a/accel/Makefile.objs b/accel/Makefile.objs
>>>> index ff72f0d030..c5e58eb53d 100644
>>>> --- a/accel/Makefile.objs
>>>> +++ b/accel/Makefile.objs
>>>> @@ -1,5 +1,5 @@
>>>> common-obj-$(CONFIG_SOFTMMU) += accel.o
>>>> -obj-$(call land,$(CONFIG_SOFTMMU),$(CONFIG_POSIX)) += qtest.o
>>>> +obj-$(call land,$(CONFIG_SOFTMMU),$(CONFIG_POSIX)) += qtest/
>>>
>>> This does raise the question if qtest is "just another" accelerator then
>>> should we not be creating a CONFIG_QTEST symbol for explicitness?
>>>
>>>> obj-$(CONFIG_KVM) += kvm/
>>>> obj-$(CONFIG_TCG) += tcg/
>>>> obj-$(CONFIG_XEN) += xen/
>>> <snip>
>>>> +static void *qtest_cpu_thread_fn(void *arg)
>>>> +{
>>>> +#ifdef _WIN32
>>>> + error_report("qtest is not supported under Windows");
>>>> + exit(1);
>>>> +#else
>>>
>>> This is literally impossible to build isn't it?
>>>>
>>>> static int qtest_init_accel(MachineState *ms)
>>>> {
>>>> + cpus_register_accel(&qtest_cpus);
>>>> return 0;
>>>> }
>>>
>>> I wonder if these register functions could be moved to initfns like we
>>> use for our hardware models?
>>
>> The context is the configure_accelerator() in vl.c , where we loop over
>> possible candidate accelerators
>> and try to initialize them.
>>
>> In this RFC the cpus_register_accel is triggered at accel_init_machine()
>> time,
>> in the accelerator class init_machine() method, where we are trying to use a
>> specific accelerator.
>>
>> This is the case for qtest like for the other AccelClass types (tcg and the
>> hardware accelerators).
>>
>> If not in init_machine(), where would the registration best happen?
>
> Ahh I see - this is once the decision about which accelerator has been
> made. I was thinking along the lines of the init functions driven by:
>
> #define type_init(function) module_init(function, MODULE_INIT_QOM)
>
> which would then populate the list of available accelerators in a more
> QOM like manner. I assume having a completely configurable set of
> accelerators is the eventual aim of this?
Yes, in my plan the change to add target-specific modules and eventually
CONFIG_TCG=m, CONFIG_KVM=m etc is basically the last step, the way I see it.
tentative plan is at:
https://lists.gnu.org/archive/html/qemu-devel/2020-05/msg04628.html
>
>>
>>>
>>> <snip>
>>>>
>>>> +/*
>>>> + * every accelerator is supposed to register this.
>>>> + * Could be in the AccelClass instead, but ends up being too complicated
>>>> + * to access in practice, and inefficient for each call of each method.
>>>> + */
>>>> +static CpusAccel cpus_accel;
>>>> +
>>>
>>> wait what? Does an indirection cause that much trouble? I'm surprised
>>> given how often we use it elsewhere in the code. I guess others might
>>
>> CpusAccel is not used elsewhere currently in the codebase, it's new, or what
>> do you mean?
>>
>>> argue for a full QOM-ification of the accelerator but I think we can at
>>> least have an indirection rather than a copy of the structure.
>>>
>>>
>>
>> As mentioned in v3 and v2, this is what we end up if we put CpusAccel inside
>> the AccelClass,
>> every time we need a vcpu kick, sync state, etc:
>>
>> 1) current_accel() function call
>> 2) pointer dereference (->accelerator)
>> 3) object_class_dynamic_cast_assert function call (ACCEL_GET_CLASS ->
>> OBJECT_CLASS_CHECK)
>> 4) pointer dereference (-> AccelCpusInterface)
>> 5) pointer dereference (-> method)
>> 6) function call ( ->synchronize_state(cpu))
>>
>> So the code then would look like this (more or less, probably I would put
>> also an assert for non-NULL in there):
>>
>> VERSION A)
>>
>> void cpu_synchronize_state(CPUState *cpu)
>> {
>> ACCEL_GET_CLASS(current_accel())->cpus_int->synchronize_state(cpu);
>> }
>
> I don't think it has to be quite so extreme. I was just arguing for
> something along the lines of:
>
> static CpuAccel *accel;
>
> and
>
> void cpu_synchronize_state(CPUState *cpu)
> {
> if (accel && accel->synchronize_state) {
> accel->synchronize_state(cpu);
> }
> }
>
>> Instead with the current RFC code, this is what we end up with every
>> time we need a vcpu kick, sync state, etc:
>
> I don't think a pointer de-reference alone is super critical for
> something that happens on the outside of the main run loop. It might be
> a different argument if this was somewhere in the hot path.
>
>> Are you arguing in favor of VERSION A) here?
>
> Version C ;-)
Aha, ok I am comfortable with version C.
>
>>
>> I would like to have an ACK from the owners of the hardware accels
>> especially that the additional overhead in this code path
>> is of negligible importance..
>>
>>
>> Thank you for your comments,
>>
Ciao,
Claudio
[RFC v5 3/4] cpu-timers, icount: new modules, Claudio Fontana, 2020/06/15