qemu-devel

Re: [PATCH v7 00/13] APIC ID fixes for AMD EPYC CPU model


From: Igor Mammedov
Subject: Re: [PATCH v7 00/13] APIC ID fixes for AMD EPYC CPU model
Date: Wed, 18 Mar 2020 11:47:14 +0100

On Wed, 18 Mar 2020 02:43:57 +0000
"Moger, Babu" <address@hidden> wrote:

> [AMD Official Use Only - Internal Distribution Only]
> 
> 
> 
> > -----Original Message-----
> > From: Eduardo Habkost <address@hidden>
> > Sent: Tuesday, March 17, 2020 6:46 PM
> > To: Moger, Babu <address@hidden>
> > Cc: address@hidden; address@hidden; address@hidden;
> > address@hidden; address@hidden; address@hidden
> > Subject: Re: [PATCH v7 00/13] APIC ID fixes for AMD EPYC CPU model
> > 
> > On Tue, Mar 17, 2020 at 07:22:06PM -0400, Eduardo Habkost wrote:  
> > > On Thu, Mar 12, 2020 at 11:28:47AM -0500, Babu Moger wrote:  
> > > > Eduardo, Can you please queue the series if there are no concerns.
> > > > Thanks  
> > >
> > > I had queued it for today's pull request, but it looks like it
> > > breaks "make check".  See  
> > > https://travis-ci.org/github/ehabkost/qemu/jobs/663529282
> > >
> > >   PASS 4 bios-tables-test /x86_64/acpi/piix4/ipmi
> > >   Could not access KVM kernel module: No such file or directory
> > >   qemu-system-x86_64: -accel kvm: failed to initialize kvm: No such file or directory
> > >   qemu-system-x86_64: falling back to tcg
> > >   qemu-system-x86_64: Invalid CPU [socket: 0, die: 0, core: 1, thread: 0] with APIC ID 1, valid index range 0:5
> > >   Broken pipe
> > >   /home/travis/build/ehabkost/qemu/tests/qtest/libqtest.c:166: kill_qemu() tried to terminate QEMU process but encountered exit status 1 (expected 0)
> > >   Aborted (core dumped)
> > >   ERROR - too few tests run (expected 17, got 4)
> > >   /home/travis/build/ehabkost/qemu/tests/Makefile.include:633: recipe for target 'check-qtest-x86_64' failed
> > >   make: *** [check-qtest-x86_64] Error 1
> > 
> > Failure is at the /x86_64/acpi/piix4/cpuhp test case:
> > 
> >   $ QTEST_QEMU_BINARY=x86_64-softmmu/qemu-system-x86_64 QTEST_QEMU_IMG=qemu-img tests/qtest/bios-tables-test -m=quick --verbose --debug-log
> >   [...]
> >   {*LOG(start):{/x86_64/acpi/piix4/cpuhp}:LOG*}
> >   # starting QEMU: exec x86_64-softmmu/qemu-system-x86_64 -qtest unix:/tmp/qtest-2052313.sock -qtest-log /dev/null -chardev socket,path=/tmp/qtest-2052313.qmp,id=char0 -mon chardev=char0,mode=control -display none -machine pc,kernel-irqchip=off -accel kvm -accel tcg -net none -display none -smp 2,cores=3,sockets=2,maxcpus=6 -object memory-backend-ram,id=ram0,size=64M -object memory-backend-ram,id=ram1,size=64M -numa node,memdev=ram0 -numa node,memdev=ram1 -numa dist,src=0,dst=1,val=21 -drive id=hd0,if=none,file=tests/acpi-test-disk-PVjFru,format=raw -device ide-hd,drive=hd0  -accel qtest
> >   {*LOG(message):{starting QEMU: exec x86_64-softmmu/qemu-system-x86_64 -qtest unix:/tmp/qtest-2052313.sock -qtest-log /dev/null -chardev socket,path=/tmp/qtest-2052313.qmp,id=char0 -mon chardev=char0,mode=control -display none -machine pc,kernel-irqchip=off -accel kvm -accel tcg -net none -display none -smp 2,cores=3,sockets=2,maxcpus=6 -object memory-backend-ram,id=ram0,size=64M -object memory-backend-ram,id=ram1,size=64M -numa node,memdev=ram0 -numa node,memdev=ram1 -numa dist,src=0,dst=1,val=21 -drive id=hd0,if=none,file=tests/acpi-test-disk-PVjFru,format=raw -device ide-hd,drive=hd0  -accel qtest}:LOG*}
> >   qemu-system-x86_64: Invalid CPU [socket: 0, die: 0, core: 1, thread: 0] with APIC ID 1, valid index range 0:5
> >   Broken pipe
> 
> ms->smp.cpus is not initialized to max cpus in this case. It looks like
> smp_parse did not run in this path.
> For that reason the apicid is not initialized for all the cpus. The following
> patch fixes the problem.
> I will test all the combinations and send the patch tomorrow. Let me know
> which tree I should use to generate the patch. It appears some patches are
> already pulled. I can send it on top of
> git://github.com/ehabkost/qemu.git (x86-next).
> 
> diff --git a/hw/i386/x86.c b/hw/i386/x86.c
> index 023dce1dbd..1eeb7b9732 100644
> --- a/hw/i386/x86.c
> +++ b/hw/i386/x86.c
> @@ -156,7 +156,7 @@ void x86_cpus_init(X86MachineState *x86ms, int default_cpu_version)
>                                                        ms->smp.max_cpus - 1) + 1;
>      possible_cpus = mc->possible_cpu_arch_ids(ms);
> 
> -    for (i = 0; i < ms->smp.cpus; i++) {
> +    for (i = 0; i < ms->possible_cpus->len; i++) {
>          ms->possible_cpus->cpus[i].arch_id =
>              x86_cpu_apic_id_from_index(x86ms, i);
>      }

Indeed, it should use possible_cpus->len instead of the initial number of CPUs.
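
To illustrate the difference between the two loop bounds, here is a minimal standalone sketch. It does not use QEMU's real types: `struct slot`, `bits_for`, `apic_id_from_index` and `fill_arch_ids` are hypothetical stand-ins, and the encoding is a simplified sockets->cores->threads layout with no die level. It assumes the failing test's topology (-smp 2,cores=3,sockets=2,maxcpus=6, one thread per core).

```c
#include <assert.h>

/* Hypothetical stand-in for one entry of the possible-CPUs list. */
struct slot {
    unsigned arch_id;
};

/* Number of bits needed to hold values 0..n-1, i.e. ceil(log2(n)). */
static unsigned bits_for(unsigned n)
{
    unsigned b = 0;

    while ((1u << b) < n) {
        b++;
    }
    return b;
}

/* Simplified sockets->cores->threads APIC ID encoding (no die level). */
static unsigned apic_id_from_index(unsigned idx, unsigned cores,
                                   unsigned threads)
{
    assert(cores > 0 && threads > 0);

    unsigned t_bits = bits_for(threads);
    unsigned c_bits = bits_for(cores);
    unsigned thread = idx % threads;
    unsigned core = (idx / threads) % cores;
    unsigned pkg = idx / (threads * cores);

    return (pkg << (c_bits + t_bits)) | (core << t_bits) | thread;
}

/* Fill arch_id for the first 'count' slots only; the rest are untouched. */
static void fill_arch_ids(struct slot *slots, unsigned count,
                          unsigned cores, unsigned threads)
{
    for (unsigned i = 0; i < count; i++) {
        slots[i].arch_id = apic_id_from_index(i, cores, threads);
    }
}
```

With a zeroed six-entry array, calling fill_arch_ids(slots, 2, 3, 1) (the boot CPU count) leaves entries 2..5 at 0, while fill_arch_ids(slots, 6, 3, 1) (the possible CPU count) gives each hot-pluggable slot its own ID as well.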

> > 
> >   
> > >
> > >  
> > > >
> > > > On 3/11/20 5:52 PM, Babu Moger wrote:  
> > > > > This series fixes APIC ID encoding problem reported on AMD EPYC cpu models.
> > > > > https://bugzilla.redhat.com/show_bug.cgi?id=1728166
> > > > >
> > > > > Currently, the APIC ID is decoded based on the sequence
> > > > > sockets->dies->cores->threads. This works for most standard AMD and other
> > > > > vendors' configurations, but this decoding sequence does not follow that of
> > > > > AMD's APIC ID enumeration strictly. In some cases this can cause CPU topology
> > > > > inconsistency.  When booting a guest VM, the kernel tries to validate the
> > > > > topology, and finds it inconsistent with the enumeration of EPYC cpu models.
> > > > >
> > > > > To fix the problem we need to build the topology as per the Processor
> > > > > Programming Reference (PPR) for AMD Family 17h Model 01h, Revision B1
> > > > > Processors. The documentation is available from the bugzilla Link below.
> > > > >
> > > > > Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
> > > > >
> > > > > Here is the text from the PPR.
> > > > > Operating systems are expected to use Core::X86::Cpuid::SizeId[ApicIdSize], the
> > > > > number of least significant bits in the Initial APIC ID that indicate core ID
> > > > > within a processor, in constructing per-core CPUID masks.
> > > > > Core::X86::Cpuid::SizeId[ApicIdSize] determines the maximum number of cores
> > > > > (MNC) that the processor could theoretically support, not the actual number of
> > > > > cores that are actually implemented or enabled on the processor, as indicated
> > > > > by Core::X86::Cpuid::SizeId[NC].
> > > > > Each Core::X86::Apic::ApicId[ApicId] register is preset as follows:
> > > > > • ApicId[6] = Socket ID.
> > > > > • ApicId[5:4] = Node ID.
> > > > > • ApicId[3] = Logical CCX L3 complex ID
> > > > > • ApicId[2:0]= (SMT) ? {LogicalCoreID[1:0],ThreadId} : {1'b0,LogicalCoreID[1:0]}
> > > > >
> > > > > v7:
> > > > >  Generated the patches on top of git://github.com/ehabkost/qemu.git (x86-next).
> > > > >  Changes from v6.
> > > > >  1. Added new function x86_set_epyc_topo_handlers to override the apic id
> > > > >     encoding handlers.
> > > > >  2. Separated the code to set use_epyc_apic_id_encoding and added as a new patch
> > > > >     as it looked more logical.
> > > > >  3. Fixed minor typos.
> > > > >
> > > > > v6:
> > > > > https://lore.kernel.org/qemu-devel/158389385028.22020.7608244627303132902.stgit@naples-babu.amd.com/
> > > > >  Generated the patches on top of git://github.com/ehabkost/qemu.git (x86-next).
> > > > >  Changes from v5.
> > > > >  1. Eduardo has already queued a couple of patches, submitting the rest here.
> > > > >  2. Major change is how the EPYC mode apic id encoding handlers are loaded.
> > > > >     Added a boolean variable use_epyc_apic_id_encoding in X86CPUDefinition.
> > > > >     The variable will be used to tell if we need to use EPYC mode encoding.
> > > > >  3. Eduardo reported a bisectability problem with the x86 unit test code.
> > > > >     Squashed the patches in 1 and 2 to resolve it. Problem was change in calling
> > > > >     conventions of topology related functions.
> > > > >  4. Also set the use_epyc_apic_id_encoding for EPYC-Rome. This model is
> > > > >     added recently to the cpu table.
> > > > >
> > > > > v5:
> > > > > https://lore.kernel.org/qemu-devel/158326531474.40452.11433722850425537745.stgit@naples-babu.amd.com/
> > > > >  Generated the patches on top of git://github.com/ehabkost/qemu.git (x86-next).
> > > > >  Changes from v4.
> > > > >  1. Re-arranged the patches 2 and 4 as suggested by Igor.
> > > > >  2. Kept the apicid handler functions inside X86MachineState as discussed.
> > > > >     These handlers are loaded from X86CPUDefinitions.
> > > > >  3. Removed unnecessary X86CPUstate initialization from x86_cpu_new. Suggested
> > > > >     by Igor.
> > > > >  4. And other minor changes related to patch format.
> > > > >
> > > > > v4:
> > > > > https://lore.kernel.org/qemu-devel/158161767653.48948.10578064482878399556.stgit@naples-babu.amd.com/
> > > > >  Changes from v3.
> > > > >  1. Moved the arch_id calculation inside the function x86_cpus_init. With this change,
> > > > >     we don't need to change common numa code. (suggested by Igor)
> > > > >  2. Introduced the model specific handlers inside X86CPUDefinitions.
> > > > >     These handlers are loaded into X86MachineState during the init.
> > > > >  3. Removed llc_id from x86CPU.
> > > > >  4. Removed init_apicid_fn handler from MachineClass. Kept all the code changes
> > > > >     inside the x86.
> > > > >  5. Added new handler function apicid_pkg_offset for pkg_offset calculation.
> > > > >  6. And some other minor changes.
> > > > >
> > > > > v3:
> > > > > https://lore.kernel.org/qemu-devel/157541968844.46157.17994918142533791313.stgit@naples-babu.amd.com/
> > > > >   1. Consolidated the topology information in structure X86CPUTopoInfo.
> > > > >   2. Changed the ccx_id to llc_id as commented by upstream.
> > > > >   3. Generalized the apic id decoding. It is mostly similar to current apic id
> > > > >      except that it adds new field llc_id when numa configured. Removes all the
> > > > >      hardcoded values.
> > > > >   4. Removed the earlier parse_numa split. And moved the numa node initialization
> > > > >      inside the numa_complete_configuration. This is a bit cleaner as commented by
> > > > >      Eduardo.
> > > > >   5. Added new function init_apicid_fn inside machine_class structure. This
> > > > >      will be used to update the apic id handler specific to cpu model.
> > > > >   6. Updated the cpuid unit tests.
> > > > >   7. TODO : Need to figure out how to dynamically update the handlers using cpu models.
> > > > >      I might need some guidance on that.
> > > > >
> > > > > v2:
> > > > > https://lore.kernel.org/qemu-devel/156779689013.21957.1631551572950676212.stgit@localhost.localdomain/
> > > > >   1. Introduced the new property epyc to enable new epyc mode.
> > > > >   2. Separated the epyc mode and non epyc mode function.
> > > > >   3. Introduced function pointers in PCMachineState to handle the
> > > > >      differences.
> > > > >   4. Mildly tested different combinations to make sure things are working as expected.
> > > > >   5. TODO : Setting the epyc feature bit needs to be worked out. This feature is
> > > > >      supported only on AMD EPYC models. I may need some guidance on that.
> > > > >
> > > > > v1:
> > > > > https://lore.kernel.org/qemu-devel/20190731232032.51786-1-babu.moger@amd.com/
> > > > > ---
> > > > >
> > > > > Babu Moger (13):
> > > > >       hw/i386: Introduce X86CPUTopoInfo to contain topology info
> > > > >       hw/i386: Consolidate topology functions
> > > > >       machine: Add SMP Sockets in CpuTopology
> > > > >       hw/i386: Remove unnecessary initialization in x86_cpu_new
> > > > >       hw/i386: Update structures to save the number of nodes per package
> > > > >       hw/i386: Rename apicid_from_topo_ids to x86_apicid_from_topo_ids
> > > > >       hw/386: Add EPYC mode topology decoding functions
> > > > >       target/i386: Cleanup and use the EPYC mode topology functions
> > > > >       hw/i386: Introduce apicid functions inside X86MachineState
> > > > >       i386: Introduce use_epyc_apic_id_encoding in X86CPUDefinition
> > > > >       hw/i386: Move arch_id decode inside x86_cpus_init
> > > > >       target/i386: Enable new apic id encoding for EPYC based cpus models
> > > > >       i386: Fix pkg_id offset for EPYC cpu models
> > > > >
> > > > >
> > > > >  hw/core/machine.c          |    1
> > > > >  hw/i386/pc.c               |   15 ++-
> > > > >  hw/i386/x86.c              |   73 ++++++++++++----
> > > > >  include/hw/boards.h        |    2
> > > > >  include/hw/i386/topology.h |  195 ++++++++++++++++++++++++++++++------------
> > > > >  include/hw/i386/x86.h      |   12 +++
> > > > >  softmmu/vl.c               |    1
> > > > >  target/i386/cpu.c          |  203 ++++++++++++++------------------------------
> > > > >  target/i386/cpu.h          |    3 +
> > > > >  tests/test-x86-cpuid.c     |  116 +++++++++++++++----------
> > > > >  10 files changed, 358 insertions(+), 263 deletions(-)
> > > > >
> > > > > --
> > > > > Signature
> > > > >  
> > > >  
> > >
> > > --
> > > Eduardo  
> > 
> > --
> > Eduardo  
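
For readers following the PPR excerpt quoted in the cover letter above, the ApicId bit layout it lists can be sketched as a small packing function. This is only an illustration of that bit layout under the field widths given in the excerpt; `epyc_apic_id` and its parameter names are hypothetical and are not the functions added by this series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Pack the fields listed in the PPR excerpt into a 7-bit APIC ID:
 *   ApicId[6]   = socket ID
 *   ApicId[5:4] = node ID
 *   ApicId[3]   = logical CCX (L3 complex) ID
 *   ApicId[2:0] = SMT ? {core[1:0], thread} : {1'b0, core[1:0]}
 * Hypothetical helper for illustration only.
 */
static uint8_t epyc_apic_id(unsigned socket, unsigned node, unsigned ccx,
                            unsigned core, unsigned thread, bool smt)
{
    /* Each field must fit in the bit width the layout gives it. */
    assert(socket < 2 && node < 4 && ccx < 2 && core < 4 && thread < 2);
    /* Without SMT there is only thread 0 per core. */
    assert(smt || thread == 0);

    unsigned low = smt ? ((core << 1) | thread) : core;

    return (uint8_t)((socket << 6) | (node << 4) | (ccx << 3) | low);
}
```

For example, with SMT enabled, socket 1, node 2, CCX 1, core 3, thread 1 packs to 0x6f; with SMT disabled, socket 0, node 1, CCX 0, core 2 packs to 0x12.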



