qemu-arm
From: Gavin Shan
Subject: Re: [PATCH v3 3/4] hw/acpi/aml-build: Use existing CPU topology to build PPTT table
Date: Sun, 3 Apr 2022 22:40:10 +0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.2.0

Hi Igor,

On 3/30/22 10:10 PM, Igor Mammedov wrote:
On Wed, 23 Mar 2022 15:24:37 +0800
Gavin Shan <gshan@redhat.com> wrote:

When the PPTT table is built, the CPU topology is re-calculated, but
it's unnecessary because the CPU topology has been populated in
virt_possible_cpu_arch_ids() on the arm/virt machine.

This avoids re-calculating the CPU topology by reusing the existing
one in ms->possible_cpus. Currently, the only user of build_pptt() is
the arm/virt machine.

Signed-off-by: Gavin Shan <gshan@redhat.com>
---
  hw/acpi/aml-build.c | 96 +++++++++++++++++++++++++++++++++------------
  1 file changed, 72 insertions(+), 24 deletions(-)

diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index 4086879ebf..10a2d63b96 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -2002,18 +2002,27 @@ void build_pptt(GArray *table_data, BIOSLinker *linker, MachineState *ms,
                  const char *oem_id, const char *oem_table_id)
  {
      MachineClass *mc = MACHINE_GET_CLASS(ms);
+    CPUArchIdList *cpus = ms->possible_cpus;
+    GQueue *socket_list = g_queue_new();
+    GQueue *cluster_list = g_queue_new();
+    GQueue *core_list = g_queue_new();
      GQueue *list = g_queue_new();
      guint pptt_start = table_data->len;
      guint parent_offset;
      guint length, i;
-    int uid = 0;
-    int socket;
+    int n, socket_id, cluster_id, core_id, thread_id;
      AcpiTable table = { .sig = "PPTT", .rev = 2,
                          .oem_id = oem_id, .oem_table_id = oem_table_id };
      acpi_table_begin(&table, table_data);

-    for (socket = 0; socket < ms->smp.sockets; socket++) {
+    for (n = 0; n < cpus->len; n++) {
+        socket_id = cpus->cpus[n].props.socket_id;
+        if (g_queue_find(socket_list, GUINT_TO_POINTER(socket_id))) {
+            continue;
+        }

maybe instead of scanning cpus[n] every time for each topology level
and trying to keep the code flattened (which mimics the PPTT flattened
tree table for not much reason; the spec doesn't require entries
from the same level to be described contiguously),
try to rebuild the hierarchy tree from the flat cpus[n] in one pass first
and then use nested loops or recursion to build the PPTT table,
something like:

  sockets = cpus_to_topo(possible)
  build_pptt_level(items = sockets, parent_ref = 0)
   for item in items
      level_ref = table_data->len - pptt_start
      build_processor_hierarchy_node(item {id, flags, ...}, parent_ref)
      if not leaf:
         build_pptt_level(item, level_ref)

which is much more compact and easier to read compared to the
unrolled impl. as it is now, with all the push/pop stack emulation.


I missed your comments when v4 was posted. Sorry about this. I'm using
the Thunderbird mail client and have some filters running to put incoming
mails into the corresponding folders, but this reply was put into the
wrong folder. That's why I always copy my private email address when
sending patches and emails. Please ignore v4 and review v5 directly.

Thanks for the suggestion, but unless I've missed something, it's going
to introduce duplicate entries for the same socket/cluster/core.
Otherwise, the CPUs still need to be iterated over to check whether
they belong to the corresponding level.
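
For reference, here is roughly how I read the recursive half of your
sketch in C. This is only a sketch under assumptions: 'topo_node' and
'build_pptt_level' are hypothetical and don't exist in QEMU;
build_processor_hierarchy_node() is the existing helper:

    /* Hypothetical tree node, one per socket/cluster/core/thread. */
    typedef struct topo_node {
        uint32_t id;       /* topology ID at this level        */
        uint32_t flags;    /* PPTT processor structure flags   */
        GQueue children;   /* child topo_node entries, if any  */
    } topo_node;

    /* Emit all nodes of one PPTT level, recursing into children. */
    static void build_pptt_level(GArray *table_data, guint pptt_start,
                                 GQueue *items, uint32_t parent_ref)
    {
        for (GList *l = items->head; l; l = l->next) {
            topo_node *node = l->data;
            uint32_t level_ref = table_data->len - pptt_start;

            build_processor_hierarchy_node(table_data, node->flags,
                                           parent_ref, node->id, NULL, 0);
            if (!g_queue_is_empty(&node->children)) {
                build_pptt_level(table_data, pptt_start,
                                 &node->children, level_ref);
            }
        }
    }

The tree itself would still have to be grouped from the flat cpus[n]
in a first pass, and that grouping is where I see the duplicate
socket/cluster/core entries coming from unless the IDs are
deduplicated there as well.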

To simplify things and remove the stack emulation, I will introduce
variables to track the socket/cluster/core IDs whose ACPI table entries
have already been added. Whenever the socket, cluster or core ID changes
while iterating over 'ms->possible_cpus', the corresponding ACPI table
entry is added and the IDs for the child levels are invalidated. With
this, all needed ACPI table entries are created in one loop over
'ms->possible_cpus'.
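
In pseudo-C, the idea looks roughly like this (a sketch of what v5 will
do, not the final code; the socket_offset/cluster_offset variables are
illustrative):

    /* Sketch: one pass over possible_cpus. A node is emitted whenever
     * its ID changes; the child-level IDs are invalidated so that the
     * children are re-emitted under the new parent.
     */
    int socket_id = -1, cluster_id = -1, core_id = -1;
    uint32_t socket_offset = 0, cluster_offset = 0;

    for (n = 0; n < cpus->len; n++) {
        if (cpus->cpus[n].props.socket_id != socket_id) {
            socket_id = cpus->cpus[n].props.socket_id;
            cluster_id = -1;            /* invalidate child levels */
            core_id = -1;
            socket_offset = table_data->len - pptt_start;
            build_processor_hierarchy_node(table_data,
                (1 << 0), /* physical package */
                0, socket_id, NULL, 0);
        }

        if (mc->smp_props.clusters_supported &&
            cpus->cpus[n].props.cluster_id != cluster_id) {
            cluster_id = cpus->cpus[n].props.cluster_id;
            core_id = -1;               /* invalidate child level */
            cluster_offset = table_data->len - pptt_start;
            build_processor_hierarchy_node(table_data,
                (0 << 0), /* not a physical package */
                socket_offset, cluster_id, NULL, 0);
        }

        /* ... the same pattern repeats for cores, and then one leaf
         * node (core or thread) is emitted per possible_cpus entry. */
    }

This relies on 'ms->possible_cpus' being ordered by topology IDs, which
is how virt_possible_cpu_arch_ids() generates the list on arm/virt.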

The changes will be included in v5, which will be posted shortly.

Thanks,
Gavin


+
+        g_queue_push_tail(socket_list, GUINT_TO_POINTER(socket_id));
          g_queue_push_tail(list,
              GUINT_TO_POINTER(table_data->len - pptt_start));
          build_processor_hierarchy_node(
@@ -2023,65 +2032,104 @@ void build_pptt(GArray *table_data, BIOSLinker *linker, MachineState *ms,
               * of a physical package
               */
              (1 << 0),
-            0, socket, NULL, 0);
+            0, socket_id, NULL, 0);
      }

      if (mc->smp_props.clusters_supported) {
          length = g_queue_get_length(list);
          for (i = 0; i < length; i++) {
-            int cluster;
-
              parent_offset = GPOINTER_TO_UINT(g_queue_pop_head(list));
-            for (cluster = 0; cluster < ms->smp.clusters; cluster++) {
+            socket_id = GPOINTER_TO_UINT(g_queue_pop_head(socket_list));
+
+            for (n = 0; n < cpus->len; n++) {
+                if (cpus->cpus[n].props.socket_id != socket_id) {
+                    continue;
+                }
+
+                cluster_id = cpus->cpus[n].props.cluster_id;
+                if (g_queue_find(cluster_list, GUINT_TO_POINTER(cluster_id))) {
+                    continue;
+                }
+
+                g_queue_push_tail(cluster_list, GUINT_TO_POINTER(cluster_id));
                  g_queue_push_tail(list,
                      GUINT_TO_POINTER(table_data->len - pptt_start));
                  build_processor_hierarchy_node(
                      table_data,
                      (0 << 0), /* not a physical package */
-                    parent_offset, cluster, NULL, 0);
+                    parent_offset, cluster_id, NULL, 0);
              }
          }
      }

      length = g_queue_get_length(list);
      for (i = 0; i < length; i++) {
-        int core;
-
          parent_offset = GPOINTER_TO_UINT(g_queue_pop_head(list));
-        for (core = 0; core < ms->smp.cores; core++) {
-            if (ms->smp.threads > 1) {
-                g_queue_push_tail(list,
-                    GUINT_TO_POINTER(table_data->len - pptt_start));
-                build_processor_hierarchy_node(
-                    table_data,
-                    (0 << 0), /* not a physical package */
-                    parent_offset, core, NULL, 0);
-            } else {
+        if (!mc->smp_props.clusters_supported) {
+            socket_id = GPOINTER_TO_UINT(g_queue_pop_head(socket_list));
+        } else {
+            cluster_id = GPOINTER_TO_UINT(g_queue_pop_head(cluster_list));
+        }
+
+        for (n = 0; n < cpus->len; n++) {
+            if (!mc->smp_props.clusters_supported &&
+                cpus->cpus[n].props.socket_id != socket_id) {
+                continue;
+            }
+
+            if (mc->smp_props.clusters_supported &&
+                cpus->cpus[n].props.cluster_id != cluster_id) {
+                continue;
+            }
+
+            core_id = cpus->cpus[n].props.core_id;
+            if (ms->smp.threads <= 1) {
                  build_processor_hierarchy_node(
                      table_data,
                      (1 << 1) | /* ACPI Processor ID valid */
                      (1 << 3),  /* Node is a Leaf */
-                    parent_offset, uid++, NULL, 0);
+                    parent_offset, core_id, NULL, 0);
+                continue;
+            }
+
+            if (g_queue_find(core_list, GUINT_TO_POINTER(core_id))) {
+                continue;
              }
+
+            g_queue_push_tail(core_list, GUINT_TO_POINTER(core_id));
+            g_queue_push_tail(list,
+                GUINT_TO_POINTER(table_data->len - pptt_start));
+            build_processor_hierarchy_node(
+                table_data,
+                (0 << 0), /* not a physical package */
+                parent_offset, core_id, NULL, 0);
          }
      }

      length = g_queue_get_length(list);
      for (i = 0; i < length; i++) {
-        int thread;
-
          parent_offset = GPOINTER_TO_UINT(g_queue_pop_head(list));
-        for (thread = 0; thread < ms->smp.threads; thread++) {
+        core_id = GPOINTER_TO_UINT(g_queue_pop_head(core_list));
+
+        for (n = 0; n < cpus->len; n++) {
+            if (cpus->cpus[n].props.core_id != core_id) {
+                continue;
+            }
+
+            thread_id = cpus->cpus[n].props.thread_id;
              build_processor_hierarchy_node(
                  table_data,
                  (1 << 1) | /* ACPI Processor ID valid */
                  (1 << 2) | /* Processor is a Thread */
                  (1 << 3),  /* Node is a Leaf */
-                parent_offset, uid++, NULL, 0);
+                parent_offset, thread_id, NULL, 0);
          }
      }

      g_queue_free(list);
+    g_queue_free(core_list);
+    g_queue_free(cluster_list);
+    g_queue_free(socket_list);
      acpi_table_end(linker, &table);
  }




