[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH] spapr: add "compat" machine option
From: |
Alexander Graf |
Subject: |
Re: [Qemu-devel] [PATCH] spapr: add "compat" machine option |
Date: |
Tue, 5 Nov 2013 10:23:55 +0100 |
On 05.11.2013, at 03:19, Alexey Kardashevskiy <address@hidden> wrote:
> On 10/01/2013 12:49 AM, Alexander Graf wrote:
>> On 09/30/2013 03:22 PM, Alexey Kardashevskiy wrote:
>>> On 30.09.2013 21:25, Alexander Graf wrote:
>>>> On 27.09.2013, at 10:06, Alexey Kardashevskiy wrote:
>
>
> I realized it has been a while since I got your response and did not answer
> :) Sorry for the delay.
>
>
>>>>
>>>>> To be able to boot on newer hardware that the software support,
>>>>> PowerISA defines a logical PVR, one per every PowerISA specification
>>>>> version from 2.04.
>>>>>
>>>>> This adds the "compat" option which takes values 205 or 206 and forces
>>>>> QEMU to boot the guest with a logical PVR (CPU_POWERPC_LOGICAL_2_05 or
>>>>> CPU_POWERPC_LOGICAL_2_06).
>>>>>
>>>>> The guest reads the logical PVR value from "cpu-version" property of
>>>>> a CPU device node.
>>>>>
>>>>> Cc: Nikunj A Dadhania<address@hidden>
>>>>> Cc: Andreas Färber<address@hidden>
>>>>> Signed-off-by: Alexey Kardashevskiy<address@hidden>
>>>>> ---
>>>>> hw/ppc/spapr.c | 40 ++++++++++++++++++++++++++++++++++++++++
>>>>> include/hw/ppc/spapr.h | 2 ++
>>>>> target-ppc/cpu-models.h | 10 ++++++++++
>>>>> target-ppc/cpu.h | 3 +++
>>>>> target-ppc/kvm.c | 2 ++
>>>>> vl.c | 4 ++++
>>>>> 6 files changed, 61 insertions(+)
>>>>>
>>>>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>>>>> index a09a1d9..737452d 100644
>>>>> --- a/hw/ppc/spapr.c
>>>>> +++ b/hw/ppc/spapr.c
>>>>> @@ -33,6 +33,7 @@
>>>>> #include "sysemu/kvm.h"
>>>>> #include "kvm_ppc.h"
>>>>> #include "mmu-hash64.h"
>>>>> +#include "cpu-models.h"
>>>>>
>>>>> #include "hw/boards.h"
>>>>> #include "hw/ppc/ppc.h"
>>>>> @@ -196,6 +197,26 @@ static XICSState *xics_system_init(int nr_servers,
>>>>> int nr_irqs)
>>>>> return icp;
>>>>> }
>>>>>
>>>>> +static void spapr_compat_mode_init(sPAPREnvironment *spapr)
>>>>> +{
>>>>> + QemuOpts *machine_opts = qemu_get_machine_opts();
>>>>> + uint64_t compat = qemu_opt_get_number(machine_opts, "compat", 0);
>>>>> +
>>>>> + switch (compat) {
>>>>> + case 0:
>>>>> + break;
>>>>> + case 205:
>>>>> + spapr->arch_compat = CPU_POWERPC_LOGICAL_2_05;
>>>>> + break;
>>>>> + case 206:
>>>>> + spapr->arch_compat = CPU_POWERPC_LOGICAL_2_06;
>>>> Does it make sense to declare compat mode a number or would a string
>>>> be easier for users? I can imagine that "-machine compat=power6" is
>>>> easier to understand for a user than "-machine compat=205".
>>> I just follow the PowerISA spec. It does not say anywhere (at least I do
>>> not see it) that 2.05==power6. 2.05 was released when power6 was
>>> released and power6 supports 2.05 but these are not synonims. And
>>> "compat=power6" would not set "cpu-version" to any of power6 PVRs, it
>>> still will be a logical PVR. It confuses me too to tell qemu "205"
>>> instead of "power6" but it is the spec to blame :)
>>
>> So what is "2_06 plus" then? :)
>
> No idea. Why does it matter here?
>
>
>> To me it really sounds like a 1:1 mapping to cores rather than specs - the
>> ISA defines a lot more capabilities than a single core necessarily
>> supports, especially with the inclusion of booke into the generic ppc spec.
>
>
> Sounds - may be. But still not the same. The guest kernel has different
> descriptors for power6(raw) and power6(architected) with different flags
> and (slightly?) different behavior.
So even the guest kernel calls it "power6" then? Why shouldn't we?
>
>
>>>>> + break;
>>>>> + default:
>>>>> + perror("Unsupported mode, only are 205, 206 supported\n");
>>>>> + break;
>>>>> + }
>>>>> +}
>>>>> +
>>>>> static int spapr_fixup_cpu_dt(void *fdt, sPAPREnvironment *spapr)
>>>>> {
>>>>> int ret = 0, offset;
>>>>> @@ -206,6 +227,7 @@ static int spapr_fixup_cpu_dt(void *fdt,
>>>>> sPAPREnvironment *spapr)
>>>>>
>>>>> CPU_FOREACH(cpu) {
>>>>> DeviceClass *dc = DEVICE_GET_CLASS(cpu);
>>>>> + CPUPPCState *env =&(POWERPC_CPU(cpu)->env);
>>>>> uint32_t associativity[] = {cpu_to_be32(0x5),
>>>>> cpu_to_be32(0x0),
>>>>> cpu_to_be32(0x0),
>>>>> @@ -238,6 +260,14 @@ static int spapr_fixup_cpu_dt(void *fdt,
>>>>> sPAPREnvironment *spapr)
>>>>> if (ret< 0) {
>>>>> return ret;
>>>>> }
>>>>> +
>>>>> + if (env->arch_compat) {
>>>>> + ret = fdt_setprop(fdt, offset, "cpu-version",
>>>>> +&env->arch_compat, sizeof(env->arch_compat));
>>>>> + if (ret< 0) {
>>>>> + return ret;
>>>>> + }
>>>>> + }
>>>>> }
>>>>> return ret;
>>>>> }
>>>>> @@ -1145,6 +1175,8 @@ static void ppc_spapr_init(QEMUMachineInitArgs
>>>>> *args)
>>>>> spapr = g_malloc0(sizeof(*spapr));
>>>>> QLIST_INIT(&spapr->phbs);
>>>>>
>>>>> + spapr_compat_mode_init(spapr);
>>>>> +
>>>>> cpu_ppc_hypercall = emulate_spapr_hypercall;
>>>>>
>>>>> /* Allocate RMA if necessary */
>>>>> @@ -1226,6 +1258,14 @@ static void ppc_spapr_init(QEMUMachineInitArgs
>>>>> *args)
>>>>>
>>>>> xics_cpu_setup(spapr->icp, cpu);
>>>>>
>>>>> + /*
>>>>> + * If compat mode is set in the command line, pass it to CPU
>>>>> so KVM
>>>>> + * will be able to set it in the host kernel.
>>>>> + */
>>>>> + if (spapr->arch_compat) {
>>>>> + env->arch_compat = spapr->arch_compat;
>>>> You should set the compat mode in KVM here, rather than doing it in
>>> the put_registers call which gets invoked on every register sync. Or can
>>> the guest change the mode?
>>>
>>>
>>> I will change it here in the next patch (which requires kernel changes
>>> which are not there yet). The guest cannot change it directly but it can
>>> indirectly via "client-architecture-support".
>>
>> They probably want a generic callback then.
>
>
> "They"? What callback on what? :) PCR change is privileged instruction so
> the guest is not supposed to change it directly and AFAIK (sorry for my
> ignorance) "ibm,client-architecture-support" RTAS call is the only way to
> set it and it is spapr-only. How do other PPC machines do that change?
No idea what I was referring to here. Just call a generic helper that changes
the value of PCR in the QEMU SPR array as well as KVM.
>
>> What happens on reset?
>
>
> PCR has to be reset I believe.
That needs to be modeled manually somewhere. Please make sure to do this.
>
>
>>>> Also, we need to handle failure. If the kernel can not set the CPU
>>>> to
>>> 2.05 mode for example (IIRC POWER8 doesn't allow you to) we should bail
>>> out here.
>>>
>>> Yep, I'll add this easy check :)
>>>
>>>> And then there's the TCG question. We either have to disable CPU
>>> features similar to how we handle it in KVM (by setting and honoring the
>>> respective bits in PCR) or we need to bail out too and declare compat
>>> mode unsupported for TCG.
>>>
>>> At the moment we want to run old distro on new CPUs. This patch would
>>> make more sense with the "ibm,client-architecture-support" and "power8
>>> registers migration" patches which I did not post yet.
>>>
>>> Disabling "unsupported" bits in TCG would be really nice indeed and we
>>> should eventually do this but 1) it will not really hurt the guest if we
>>> did not disable some feature (really old guest will not use it and
>>> that's it) 2) once created, PowerPCCPUState (or whatever the exact name
>>> is, I am writing this mail on windows machine :) ) is not very flexible
>>> to reconfigure...
>>
>> Can't you just set the bits in PCR and add an XXX comment indicating that
>> we're currently not honoring them? Then fron the machine code point of
>> view, everything is implemented.
>
>
> Is not it what the patch does at the moment to TCG (except missing comment)?
You're modifying env->arch_compat. You should be modifying env->sprs[SPRN_PCR].
>
>
>
>>>> And then there's the fact that the kernel interface isn't upstream in a
>>>> way that
>>> ? unfinished sentence? What is missing in the kernel right now?
>>
>> Eh. I think I wanted to say that this depends on in-kernel patches that are
>> not upstream yet :).
>>
>>>
>>>
>>>>> + }
>>>>> +
>>>>> qemu_register_reset(spapr_cpu_reset, cpu);
>>>>> }
>>>>>
>>>>> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
>>>>> index ca175b0..201c578 100644
>>>>> --- a/include/hw/ppc/spapr.h
>>>>> +++ b/include/hw/ppc/spapr.h
>>>>> @@ -34,6 +34,8 @@ typedef struct sPAPREnvironment {
>>>>> uint32_t epow_irq;
>>>>> Notifier epow_notifier;
>>>>>
>>>>> + uint32_t arch_compat; /* Compatible PVR from the command
>>>>> line */
>>>>> +
>>>>> /* Migration state */
>>>>> int htab_save_index;
>>>>> bool htab_first_pass;
>>>>> diff --git a/target-ppc/cpu-models.h b/target-ppc/cpu-models.h
>>>>> index 49ba4a4..d7c033c 100644
>>>>> --- a/target-ppc/cpu-models.h
>>>>> +++ b/target-ppc/cpu-models.h
>>>>> @@ -583,6 +583,16 @@ enum {
>>>>> CPU_POWERPC_RS64II = 0x00340000,
>>>>> CPU_POWERPC_RS64III = 0x00360000,
>>>>> CPU_POWERPC_RS64IV = 0x00370000,
>>>>> +
>>>>> + /* Logical CPUs */
>>>>> + CPU_POWERPC_LOGICAL_MASK = 0xFFFFFFFF,
>>>>> + CPU_POWERPC_LOGICAL_2_04 = 0x0F000001,
>>>>> + CPU_POWERPC_LOGICAL_2_05 = 0x0F000002,
>>>>> + CPU_POWERPC_LOGICAL_2_06 = 0x0F000003,
>>>>> + CPU_POWERPC_LOGICAL_2_06_PLUS = 0x0F100003,
>>>>> + CPU_POWERPC_LOGICAL_2_07 = 0x0F000004,
>>>>> + CPU_POWERPC_LOGICAL_2_08 = 0x0F000005,
>>>> These don't really belong here.
>>>
>>> Sorry, I do not understand. These are PVRs which are used in
>>> "cpu-version" DT property. They are logical PVRs but still PVRs - i.e.
>>> the guest has PVR masks for them too.
>>
>> But they are specific to PAPR, not PowerPC in general, no? And you would
>> never find one in the PVR register of a guest.
>
>
> Yes, never. But they all are in the same array in
> arch/powerpc/kernel/cputable.c. How is that different?
They are not CPUs. That's what's different. These numbers don't exist on any
real silicon and will never get returned my mfpvr.
>
>
>>>>> +
>>>>> #endif /* defined(TARGET_PPC64) */
>>>>> /* Original POWER */
>>>>> /* XXX: should be POWER (RIOS), RSC3308, RSC4608,
>>>>> diff --git a/target-ppc/cpu.h b/target-ppc/cpu.h
>>>>> index 422a6bb..fc837c1 100644
>>>>> --- a/target-ppc/cpu.h
>>>>> +++ b/target-ppc/cpu.h
>>>>> @@ -999,6 +999,9 @@ struct CPUPPCState {
>>>>> /* Device control registers */
>>>>> ppc_dcr_t *dcr_env;
>>>>>
>>>>> + /* Architecture compatibility mode */
>>>>> + uint32_t arch_compat;
>>>
>>>> Do we really need to carry this in the vcpu struct? Or can we just
>>>> fire-and-forget about it? If we want to preserve anything, it should
>>>> be the PCR register.
>>> This is the current PVR value aka "cpu-version" property. It may change
>>> during reboots as every reboot does the "client-architecture-support"
>>> call so the logical PVR can change.
>>
>> Ok, so again: What happens on reset / reboot? Do we want to preserve the
>> compatibility setting or does that get reset as well?
>
> The setting must be preserved on reboot ("client-architecture-support"
> definitely expects it to be preserved). I am not sure about reset, I guess
> it can/should be reset.
Please find out for sure. It's critical we get this right and model it
accordingly.
Alex