qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Bug 1871842] [NEW] AMD CPUID leaf 0x8000'0008 reported number of co


From: Philipp Eppelt
Subject: Re: [Bug 1871842] [NEW] AMD CPUID leaf 0x8000'0008 reported number of cores inconsistent with ACPI.MADT
Date: Tue, 14 Apr 2020 08:24:52 -0000

Hi,

thanks for looking into this so quickly.

With this patch applied ontop of git commit
f3bac27cc1e303e1860cc55b9b6889ba39dee587 I still have the issue and it
reports the same numbers. I like the new usage of the ApicIdSize field.


I looked into the mentioned pc_smp_parse() and had it print the topology
for -smp 4:

qemu-system-x86_64: warning: cpu topology: sockets (4) , dies (1) ,
cores (1) , threads (1) , maxcpus (4), cpus (4)

and with -smp 4,cores=4:

qemu-system-x86_64: warning: cpu topology: sockets (1) , dies (1) ,
cores (4) , threads (1) , maxcpus (4), cpus (4)

As far as I understand it, these are the numbers the cpuid:8000'0008
code relies on:
`cs->nr_cores`, `cs->nr_threads` with `cs` being of type CPUState.

So I think the issue is rooted with the preferring sockets over cores
when the -smp cmdline option is parsed, as stated in hw/i386/pc.c:729.

I guess this is the same code for Intel and AMD CPUs alike and this
issue just didn't surface for us on Intel CPUs, as they don't have this
CPUID leaf and we don't look at the topology.

This seems to boil down to a more careful use of the -smp option on my
end.

Thanks again for looking into this.

Cheers,
Philipp


On 4/10/20 2:12 AM, Babu Moger wrote:
> Philipp,
>   Can you please check if this patch works for you.
> 
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 90ffc5f..e467fee 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -5831,10 +5831,17 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t
> index, uint32_t count,
>          }
>          *ebx = env->features[FEAT_8000_0008_EBX];
>          *ecx = 0;
> -        *edx = 0;
>          if (cs->nr_cores * cs->nr_threads > 1) {
> -            *ecx |= (cs->nr_cores * cs->nr_threads) - 1;
> +            unsigned long max_apicids, bits_required;
> +
> +            max_apicids = (cs->nr_cores * cs->nr_threads) - 1;
> +            if (max_apicids) {
> +                /* Find out the number of bits to represent all the
> apicids */
> +                bits_required = find_last_bit(&max_apicids,
> BITS_PER_BYTE) + 1;
> +                *ecx |= bits_required << 12 | max_apicids;
> +            }
>          }
> +        *edx = 0;
>          break;
>      case 0x8000000A:
>          if (env->features[FEAT_8000_0001_ECX] & CPUID_EXT3_SVM) {
> 
> 
> On 4/9/20 9:00 AM, Igor Mammedov wrote:
>> On Thu, 09 Apr 2020 12:58:11 -0000
>> Philipp Eppelt <address@hidden> wrote:
>>
>>> Public bug reported:
>>>
>>> Setup:
>>> CPU: AMD EPYC-v2 or host's EPYC cpu
>>> Linux 64-bit fedora host; Kernel version 5.5.15-200.fc31
>>> qemu version: self build
>>> git-head: f3bac27cc1e303e1860cc55b9b6889ba39dee587
>>> config: Configured with: '../configure' 
>>> '--target-list=x86_64-softmmu,mips64el-softmmu,mips64-softmmu,mipsel-softmmu,mips-softmmu,i386-softmmu,aarch64-softmmu,arm-softmmu'
>>>  '--prefix=/opt/qemu-master'
>>>
>>> Cmdline: 
>>> qemu-system-x86_64 -kernel 
>>> /home/peppelt/code/l4/internal/.build-x86_64/bin/amd64_gen/bootstrap 
>>> -append "" -initrd "./fiasco/.build-x86_64/fiasco , ... " -serial stdio 
>>> -nographic -monitor none -nographic -monitor none -cpu EPYC-v2 -m 4G -smp 4 
>>>
>>> Issue:
>>> We are developing an microkernel operating system called L4Re. We recently 
>>> got an AMD EPYC server for testing and we couldn't execute SMP tests of our 
>>> system when running Linux + qemu + VM w/ L4Re.
>>> In fact, the kernel did not recognize any APs at all. On AMD CPUs the 
>>> kernel checks for the number of cores reported in CPUID leaf 
>>> 0x8000_0008.ECX[NC] or [ApicIdSize].  [0][1]
>>>
>>> The physical machine reports for leaf 0x8000_0008:  EAX: 0x3030 EBX: 
>>> 0x18cf757 ECX: 0x703f EDX: 0x1000
>>> The lower four bits of ECX are the [NC] field and all set.
>>>
>>> When querying inside qemu with -enable-kvm -cpu host -smp 4 (basically as 
>>> replacement and addition to the above cmdline) the CPUID leaf shows: EAX: 
>>> 0x3024, EBX: 0x1001000, ECX: 0x0, EDX: 0x0
>>> Note, ECX is zero. Indicating that this is no SMP capabale CPU.
>>>
>>> I'm debugging it using my local machine and the QEMU provided EPYC-v2
>>> CPU model and it is reproducible there as well and reports:  EAX:
>>> 0x3028, EBX: 0x0, ECX: 0x0, EDX: 0x0
>>>
>>> I checked other AMD based CPU models (phenom, opteron_g3/g5) and they 
>>> behave the same. [2] shows the CPUID 0x8000'0008 handling in the QEMU 
>>> source.
>>> I believe that behavior here is wrong as ECX[NC] should report the number 
>>> of cores per processor, as stated in the AMD manual [2] p.584. In my 
>>> understanding -smp 4 should then lead to ECX[NC] = 0x3.
>>>
>>> The following table shows my findings with the -smp option:
>>> Option | Qemu guest observed ECX value
>>> -smp 4 | 0x0
>>> -smp 4,cores=4  | 0x3
>>> -smp 4,cores=2,thread=2 | 0x3
>>> -smp 4,cores=4,threads=2 | QEMU boot error: topology false.
>>>
>>> Now, I'm asking myself how the terminology of the AMD manual maps to QEMU's 
>>> -smp option.
>>> Obviously, nr_cores and nr_threads correspond to the cores and threads 
>>> options on the cmdline and cores * threads <= 4 (in this example), but what 
>>> corresponds the X in -smp X to?
>> I'd say X corresponds to number of logical CPUs.
>> Depending on presence of other options these are distributed among them in 
>> magical manner
>> (see pc_smp_parse() for starters)
>>
>>> Querying 0x8000'0008 on the physical processor results in different
>>> reports than quering QEMU's model as does it with -enable-kvm -cpu host.
>>>
>>> Furthermore, the ACPI.MADT shows 4 local APICs to be present while the
>>> CPU leave reports a single core processor.
>> it matches -smp X as it should be.
>>
>>>
>>> This leads me to the conclusion that CPUID 0x8000'0008.ECX reports the
>>> wrong number.
>> CCed author of recent epyc patches who might know how AMD should work better 
>> than me.
>>
>>>
>>> Please let me know, if you need more information from my side.
>>>
>>>
>>> [0] 
>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fkernkonzept%2Ffiasco%2Fblob%2F522ccc5f29ab120213cf02d71328e2b879cbbd19%2Fsrc%2Fkern%2Fia32%2Fkernel_thread-ia32.cpp%23L109&amp;data=02%7C01%7Cbabu.moger%40amd.com%7C57569f7959744399655b08d7dc8e6e24%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637220379083511672&amp;sdata=hcFJzLAVQoIh5IN9CP%2F9cUQNOZoBnpRA6FliJur1wzQ%3D&amp;reserved=0
>>> [1] 
>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fkernkonzept%2Ffiasco%2Fblob%2F522ccc5f29ab120213cf02d71328e2b879cbbd19%2Fsrc%2Fkern%2Fia32%2Fcpu-ia32.cpp%23L1120&amp;data=02%7C01%7Cbabu.moger%40amd.com%7C57569f7959744399655b08d7dc8e6e24%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637220379083511672&amp;sdata=ANJIbYKbwfq2bDelH%2FRLKnDPIUZc1BwxHspmgxLU7gs%3D&amp;reserved=0
>>> [2] 
>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fqemu%2Fqemu%2Fblob%2Ff2a8261110c32c4dccd84e774d8dd7a0524e00fb%2Ftarget%2Fi386%2Fcpu.c%23L5835&amp;data=02%7C01%7Cbabu.moger%40amd.com%7C57569f7959744399655b08d7dc8e6e24%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637220379083511672&amp;sdata=oj3mv9e5YOzUsfUjXK44gC8LybyWgMKo8JBIrRR%2BmDA%3D&amp;reserved=0
>>> [3] 
>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.amd.com%2Fsystem%2Ffiles%2FTechDocs%2F24594.pdf&amp;data=02%7C01%7Cbabu.moger%40amd.com%7C57569f7959744399655b08d7dc8e6e24%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637220379083511672&amp;sdata=7Yr3J9ihlqSqXCXKN5JJNTByO3NGI%2BGMz2EqBF2Y4hw%3D&amp;reserved=0
>>>
>>> ** Affects: qemu
>>>      Importance: Undecided
>>>          Status: New
>>>
>>
> 

-- 
address@hidden - Tel. 0351-41 883 221
http://www.kernkonzept.com

Kernkonzept GmbH.  Sitz: Dresden.  Amtsgericht Dresden, HRB 31129.
Geschäftsführer: Dr.-Ing. Michael Hohmuth

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1871842

Title:
  AMD CPUID leaf 0x8000'0008 reported number of cores  inconsistent with
  ACPI.MADT

Status in QEMU:
  New

Bug description:
  Setup:
  CPU: AMD EPYC-v2 or host's EPYC cpu
  Linux 64-bit fedora host; Kernel version 5.5.15-200.fc31
  qemu version: self build
  git-head: f3bac27cc1e303e1860cc55b9b6889ba39dee587
  config: Configured with: '../configure' 
'--target-list=x86_64-softmmu,mips64el-softmmu,mips64-softmmu,mipsel-softmmu,mips-softmmu,i386-softmmu,aarch64-softmmu,arm-softmmu'
 '--prefix=/opt/qemu-master'

  Cmdline: 
  qemu-system-x86_64 -kernel 
/home/peppelt/code/l4/internal/.build-x86_64/bin/amd64_gen/bootstrap -append "" 
-initrd "./fiasco/.build-x86_64/fiasco , ... " -serial stdio -nographic 
-monitor none -nographic -monitor none -cpu EPYC-v2 -m 4G -smp 4 

  Issue:
  We are developing an microkernel operating system called L4Re. We recently 
got an AMD EPYC server for testing and we couldn't execute SMP tests of our 
system when running Linux + qemu + VM w/ L4Re.
  In fact, the kernel did not recognize any APs at all. On AMD CPUs the kernel 
checks for the number of cores reported in CPUID leaf 0x8000_0008.ECX[NC] or 
[ApicIdSize].  [0][1]

  The physical machine reports for leaf 0x8000_0008:  EAX: 0x3030 EBX: 
0x18cf757 ECX: 0x703f EDX: 0x1000
  The lower four bits of ECX are the [NC] field and all set.

  When querying inside qemu with -enable-kvm -cpu host -smp 4 (basically as 
replacement and addition to the above cmdline) the CPUID leaf shows: EAX: 
0x3024, EBX: 0x1001000, ECX: 0x0, EDX: 0x0
  Note, ECX is zero. Indicating that this is no SMP capabale CPU.

  I'm debugging it using my local machine and the QEMU provided EPYC-v2
  CPU model and it is reproducible there as well and reports:  EAX:
  0x3028, EBX: 0x0, ECX: 0x0, EDX: 0x0

  I checked other AMD based CPU models (phenom, opteron_g3/g5) and they behave 
the same. [2] shows the CPUID 0x8000'0008 handling in the QEMU source.
  I believe that behavior here is wrong as ECX[NC] should report the number of 
cores per processor, as stated in the AMD manual [2] p.584. In my understanding 
-smp 4 should then lead to ECX[NC] = 0x3.

  The following table shows my findings with the -smp option:
  Option | Qemu guest observed ECX value
  -smp 4 | 0x0
  -smp 4,cores=4  | 0x3
  -smp 4,cores=2,thread=2 | 0x3
  -smp 4,cores=4,threads=2 | QEMU boot error: topology false.

  Now, I'm asking myself how the terminology of the AMD manual maps to QEMU's 
-smp option.
  Obviously, nr_cores and nr_threads correspond to the cores and threads 
options on the cmdline and cores * threads <= 4 (in this example), but what 
corresponds the X in -smp X to?

  Querying 0x8000'0008 on the physical processor results in different
  reports than quering QEMU's model as does it with -enable-kvm -cpu
  host.

  Furthermore, the ACPI.MADT shows 4 local APICs to be present while the
  CPU leave reports a single core processor.

  This leads me to the conclusion that CPUID 0x8000'0008.ECX reports the
  wrong number.

  
  Please let me know, if you need more information from my side.

  
  [0] 
https://github.com/kernkonzept/fiasco/blob/522ccc5f29ab120213cf02d71328e2b879cbbd19/src/kern/ia32/kernel_thread-ia32.cpp#L109
  [1] 
https://github.com/kernkonzept/fiasco/blob/522ccc5f29ab120213cf02d71328e2b879cbbd19/src/kern/ia32/cpu-ia32.cpp#L1120
  [2] 
https://github.com/qemu/qemu/blob/f2a8261110c32c4dccd84e774d8dd7a0524e00fb/target/i386/cpu.c#L5835
  [3] https://www.amd.com/system/files/TechDocs/24594.pdf

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1871842/+subscriptions



reply via email to

[Prev in Thread] Current Thread [Next in Thread]