qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH 8/9] spapr: Implement processor compatibility in


From: Alexey Kardashevskiy
Subject: Re: [Qemu-devel] [PATCH 8/9] spapr: Implement processor compatibility in ibm, client-architecture-support
Date: Mon, 19 May 2014 13:09:12 +1000
User-agent: Mozilla/5.0 (X11; Linux i686 on x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.5.0

On 05/17/2014 11:45 AM, Alexey Kardashevskiy wrote:
> On 05/17/2014 06:46 AM, Alexander Graf wrote:
>>
>> On 16.05.14 17:57, Alexey Kardashevskiy wrote:
>>> On 05/17/2014 12:16 AM, Alexander Graf wrote:
>>>> On 15.05.14 13:28, Alexey Kardashevskiy wrote:
>>>>> Modern Linux kernels support last POWERPC CPUs so when a kernel boots,
>>>>> in most cases it can find a matching cpu_spec in the kernel's cpu_specs
>>>>> list. However if the kernel is quite old, it may be missing a definition
>>>>> of the actual CPU. To provide an ability for old kernels to work on modern
>>>>> hardware, a Processor Compatibility Mode has been introduced
>>>>> by the PowerISA specification.
>>>>>
>>>>>   From the hardware prospective, it is supported by the Processor
>>>>> Compatibility Register (PCR) which is defined in PowerISA. The register
>>>>> enables one of the compatibility modes (2.05/2.06/2.07).
>>>>> Since PCR is a hypervisor privileged register and cannot be
>>>>> accessed from the guest, the mode selection is done via
>>>>> ibm,client-architecture-support (CAS) RTAS call using which the guest
>>>>> specifies what "raw" and "architected" CPU versions it supports.
>>>>> QEMU works out the best match, changes a "cpu-version" property of
>>>>> every CPU and notifies the guest about the change by setting these
>>>>> properties in the buffer passed as a response on a custom H_CAS hypercall.
>>>>>
>>>>> Signed-off-by: Alexey Kardashevskiy <address@hidden>
>>>>> ---
>>>>>    hw/ppc/spapr.c       | 40 +++++++++++++++++++++++++
>>>>>    hw/ppc/spapr_hcall.c | 83
>>>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>>>    trace-events         |  5 ++++
>>>>>    3 files changed, 128 insertions(+)
>>>>>
>>>>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>>>>> index a2c9106..a0882a1 100644
>>>>> --- a/hw/ppc/spapr.c
>>>>> +++ b/hw/ppc/spapr.c
>>>>> @@ -35,6 +35,7 @@
>>>>>    #include "kvm_ppc.h"
>>>>>    #include "mmu-hash64.h"
>>>>>    #include "cpu-models.h"
>>>>> +#include "qom/cpu.h"
>>>>>      #include "hw/boards.h"
>>>>>    #include "hw/ppc/ppc.h"
>>>>> @@ -592,11 +593,50 @@ int spapr_h_cas_compose_response(target_ulong addr,
>>>>> target_ulong size)
>>>>>    {
>>>>>        void *fdt;
>>>>>        sPAPRDeviceTreeUpdateHeader hdr = { .version_id = 1 };
>>>>> +    CPUState *cs;
>>>>> +    int smt = kvmppc_smt_threads();
>>>>>          size -= sizeof(hdr);
>>>>>          fdt = g_malloc0(size);
>>>>>        _FDT((fdt_create(fdt, size)));
>>>>> +    _FDT((fdt_begin_node(fdt, "cpus")));
>>>>> +
>>>>> +    CPU_FOREACH(cs) {
>>>>> +        PowerPCCPU *cpu = POWERPC_CPU(cs);
>>>>> +        DeviceClass *dc = DEVICE_GET_CLASS(cpu);
>>>>> +        int smpt = spapr_get_compat_smp_threads(cpu);
>>>>> +        int i, index = ppc_get_vcpu_dt_id(cpu);
>>>>> +        uint32_t servers_prop[smpt];
>>>>> +        uint32_t gservers_prop[smpt * 2];
>>>>> +        char tmp[32];
>>>>> +
>>>>> +        if ((index % smt) != 0) {
>>>>> +            continue;
>>>>> +        }
>>>>> +
>>>>> +        snprintf(tmp, 32, "address@hidden", dc->fw_name, index);
>>>>> +        trace_spapr_cas_add_subnode(tmp);
>>>>> +
>>>>> +        _FDT((fdt_begin_node(fdt, tmp)));
>>>>> +        _FDT((fdt_property_cell(fdt, "cpu-version", cpu->cpu_version)));
>>>>> +
>>>>> +        /* Build interrupt servers and gservers properties */
>>>>> +        for (i = 0; i < smpt; i++) {
>>>>> +            servers_prop[i] = cpu_to_be32(index + i);
>>>>> +            /* Hack, direct the group queues back to cpu 0 */
>>>>> +            gservers_prop[i*2] = cpu_to_be32(index + i);
>>>>> +            gservers_prop[i*2 + 1] = 0;
>>>>> +        }
>>>>> +        _FDT((fdt_property(fdt, "ibm,ppc-interrupt-server#s",
>>>>> +                           servers_prop, sizeof(servers_prop))));
>>>>> +        _FDT((fdt_property(fdt, "ibm,ppc-interrupt-gserver#s",
>>>>> +                           gservers_prop, sizeof(gservers_prop))));
>>>>> +
>>>>> +        _FDT((fdt_end_node(fdt)));
>>>>> +    }
>>>>> +
>>>>> +    _FDT((fdt_end_node(fdt)));
>>>> Why is this so much code? Can we only replace full nodes?
>>> It is a diff tree, in only has properties to change.
>>
>> So why do we change so much then? :)
> 
> 
> 3 (three) properties are "much" now? :)
> 
>>>> Then please
>>>> extract the original CPU node creation into a function and just call it
>>>> from the original generation and from here.
>>>>
>>>> If we can also replace single properties I'd say we only need to override
>>>> cpu-version.
>>>
>>> Mmm. The user could run qemu with threads=8 on POWER8 with SLES11 which
>>> must not see 8 threads but it can update itself and reboot with the kernel
>>> which is aware of POWER8. You are saying we do not want to support that?
>>
>> First off, I think that practically speaking people will want to use
>> 2-thread granularity on POWER8 because that gives them the best load
>> exploitation on the system.
> 
> Practically speaking, people will want to use kernels which can work on
> POWER8 in raw mode, no? Here we are trying to fix older guests.
> 
>> So if we just disable CPU nodes that would not be visible usually (there
>> was a disabled flag in cpu nodes, right?) for threads that are incompatible
>> with a new CAS mode, we can then remove the disabled flag when the guest
>> accepts more threads again.

I just realized where misunderstanding happened - when I switch from 4
threads per core to 2 threads per core, the number of CPU nodes does not
change, only server#s and gserver#s properties do and this is the only way
of telling the guest the new number of threads.

So if you still want to disable some nodes, please explain more. Thanks!


>> That means we could expose 8 threads (POWER8), guest only supports 2
>> threads (POWER6), so we waste 6 threads. But the remaining 2 will be fast
>> ;). We basically degrade the system to SMT2 mode.
> 
> We do not want to show more threads (i.e. bigger ibm,ppc-interrupt-server#s
> properties) if we do not support more threads as we do not really know what
> guest kernel (including AIX) or some ppc64_cpu/drmgr/whatever tool will do
> if they find more.
> 
>> What does pHyp do here?
> 
> I wish it was easy to verify :)
> 
> I fail to see what exactly your objections are all about, I definitely miss
> something big here. Is it a minor code duplication or some dangerous bug?
> If it is just a code duplication, I could do something about it.
> 
> 


-- 
Alexey



reply via email to

[Prev in Thread] Current Thread [Next in Thread]