qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH for-6.2 v5 3/5] hw/arm/virt: Add cpu-map to device tree


From: wangyanan (Y)
Subject: Re: [PATCH for-6.2 v5 3/5] hw/arm/virt: Add cpu-map to device tree
Date: Tue, 17 Aug 2021 10:10:44 +0800
User-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101 Thunderbird/78.4.0

Hi,
On 2021/8/5 20:39, Yanan Wang wrote:
From: Andrew Jones <drjones@redhat.com>

Support device tree CPU topology descriptions.

In accordance with the Devicetree Specification, the Linux Doc
"arm/cpus.yaml" requires that cpus and cpu nodes in the DT are
present. And we have already met the requirement by generating
/cpus/cpu@* nodes for members within ms->smp.cpus. Accordingly,
we should also create subnodes in cpu-map for the present cpus,
each of which relates to an unique cpu node.

The Linux Doc "cpu/cpu-topology.txt" states that the hierarchy
of CPUs in a SMP system is defined through four entities and
they are socket/cluster/core/thread. It is also required that
a socket node's child nodes must be one or more cluster nodes.
Given that currently we are only provided with information of
socket/core/thread, we assume there is one cluster child node
in each socket node when creating cpu-map.

Signed-off-by: Andrew Jones <drjones@redhat.com>
Co-developed-by: Yanan Wang <wangyanan55@huawei.com>
Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
---
  hw/arm/virt.c | 59 ++++++++++++++++++++++++++++++++++++++++++---------
  1 file changed, 49 insertions(+), 10 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 82f2eba6bd..d1e294be95 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -350,20 +350,21 @@ static void fdt_add_cpu_nodes(const VirtMachineState *vms)
      int cpu;
      int addr_cells = 1;
      const MachineState *ms = MACHINE(vms);
+    const VirtMachineClass *vmc = VIRT_MACHINE_GET_CLASS(vms);
      int smp_cpus = ms->smp.cpus;
/*
-     * From Documentation/devicetree/bindings/arm/cpus.txt
-     *  On ARM v8 64-bit systems value should be set to 2,
-     *  that corresponds to the MPIDR_EL1 register size.
-     *  If MPIDR_EL1[63:32] value is equal to 0 on all CPUs
-     *  in the system, #address-cells can be set to 1, since
-     *  MPIDR_EL1[63:32] bits are not used for CPUs
-     *  identification.
+     * See Linux Documentation/devicetree/bindings/arm/cpus.yaml
+     * On ARM v8 64-bit systems value should be set to 2,
+     * that corresponds to the MPIDR_EL1 register size.
+     * If MPIDR_EL1[63:32] value is equal to 0 on all CPUs
+     * in the system, #address-cells can be set to 1, since
+     * MPIDR_EL1[63:32] bits are not used for CPUs
+     * identification.
       *
-     *  Here we actually don't know whether our system is 32- or 64-bit one.
-     *  The simplest way to go is to examine affinity IDs of all our CPUs. If
-     *  at least one of them has Aff3 populated, we set #address-cells to 2.
+     * Here we actually don't know whether our system is 32- or 64-bit one.
+     * The simplest way to go is to examine affinity IDs of all our CPUs. If
+     * at least one of them has Aff3 populated, we set #address-cells to 2.
       */
      for (cpu = 0; cpu < smp_cpus; cpu++) {
          ARMCPU *armcpu = ARM_CPU(qemu_get_cpu(cpu));
@@ -406,8 +407,46 @@ static void fdt_add_cpu_nodes(const VirtMachineState *vms)
                  ms->possible_cpus->cpus[cs->cpu_index].props.node_id);
          }
+ if (!vmc->no_cpu_topology) {
+            qemu_fdt_setprop_cell(ms->fdt, nodename, "phandle",
+                                  qemu_fdt_alloc_phandle(ms->fdt));
+        }
+
          g_free(nodename);
      }
+
+    if (!vmc->no_cpu_topology) {
+        /*
+         * See Linux Documentation/devicetree/bindings/cpu/cpu-topology.txt
+         * In a SMP system, the hierarchy of CPUs is defined through four
+         * entities that are used to describe the layout of physical CPUs
+         * in the system: socket/cluster/core/thread.
+         */
+        qemu_fdt_add_subnode(ms->fdt, "/cpus/cpu-map");
+
+        for (cpu = smp_cpus - 1; cpu >= 0; cpu--) {
+            char *cpu_path = g_strdup_printf("/cpus/cpu@%d", cpu);
+            char *map_path;
+
+            if (ms->smp.threads > 1) {
+                map_path = g_strdup_printf(
+                    "/cpus/cpu-map/socket%d/cluster0/core%d/thread%d",
+                    cpu / (ms->smp.cores * ms->smp.threads),
+                    (cpu / ms->smp.threads) % ms->smp.cores,
+                    cpu % ms->smp.threads);
It seems that there is some discrepancy between the documentation
(Documentation/devicetree/bindings/cpu/cpu-topology.txt) and the
actual implementation of DT topology parser for ARM64
(function parse_dt_topology() in drivers/base/arch_topology.c).

The doc says the cpu-map node's child nodes can be:
    - one or more cluster nodes or
    - one or more socket nodes in a multi-socket system
which means a cpu-map can be defined as two formats such as:
1) cpu-map
                   socket0
                                cluster0
                                             core0
                                             core1
                                cluster1
                                             core0
                                             core1
                   socket1
                                cluster0
                                             core0
                                             core1
                                cluster1
                                             core0
                                             core1

2) cpu-map
                   cluster0
                                cluster0
                                             core0
                                             core1
                                cluster1
                                             core0
                                             core1
                   cluster1
                                cluster0
                                             core0
                                             core1
                                cluster1
                                             core0
                                             core1

But current parser only assumes that there are nested clusters within
cpu-map and is unaware of socket, the parser also ignore any information
about the nesting of clusters and present the scheduler with a flat list of
them. So based on current parser, we will get 4 packages (sockets) totally,
2 cores per package, 1 threads per core from 2), but will get nothing
useful from 1).

I think the ARM64 kernel DT parser should be optimized so that it's
also aware of sockets and can parse both formats of cpu-map. But
before this, I think we still have to build the cpu-map in format 2) if
we hope to describe topology successfully through DT. :)

Thanks,
Yanan
.
+            } else {
+                map_path = g_strdup_printf(
+                    "/cpus/cpu-map/socket%d/cluster0/core%d",
+                    cpu / ms->smp.cores,
+                    cpu % ms->smp.cores);
+            }
+            qemu_fdt_add_path(ms->fdt, map_path);
+            qemu_fdt_setprop_phandle(ms->fdt, map_path, "cpu", cpu_path);
+
+            g_free(map_path);
+            g_free(cpu_path);
+        }
+    }
  }
static void fdt_add_its_gic_node(VirtMachineState *vms)




reply via email to

[Prev in Thread] Current Thread [Next in Thread]