
Re: [PATCH v3] spapr: Add a new level of NUMA for GPUs


From: Daniel Henrique Barboza
Subject: Re: [PATCH v3] spapr: Add a new level of NUMA for GPUs
Date: Thu, 16 Jul 2020 06:42:11 -0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.10.0



On 7/16/20 2:04 AM, David Gibson wrote:
On Mon, May 25, 2020 at 12:49:27PM -0500, Reza Arbab wrote:
On Mon, May 25, 2020 at 03:05:50PM +1000, David Gibson wrote:
On Fri, May 22, 2020 at 02:53:33PM -0500, Reza Arbab wrote:
--- a/hw/ppc/spapr_pci_nvlink2.c
+++ b/hw/ppc/spapr_pci_nvlink2.c
@@ -362,7 +362,7 @@ void spapr_phb_nvgpu_ram_populate_dt(SpaprPhbState *sphb, void *fdt)
          uint32_t associativity[] = {
              cpu_to_be32(0x4),
              SPAPR_GPU_NUMA_ID,
-            SPAPR_GPU_NUMA_ID,
+            cpu_to_be32(nvslot->numa_id),
              SPAPR_GPU_NUMA_ID,
              cpu_to_be32(nvslot->numa_id)


This doesn't look quite right.  In the new case we'll get {
GPU_NUMA_ID, nvslot->numa_id, GPU_NUMA_ID, nvslot->numa_id }.

The associativity reference points are 4 (and now 2), so this is what we
want. I think what you've noticed is that reference points are 1-based
ordinals:

        "...the “ibm,associativity-reference-points” property indicates
boundaries between associativity domains presented by the
“ibm,associativity” property containing “near” and “far” resources. The
first such boundary in the list represents the 1 based ordinal in the
associativity lists of the most significant boundary, with subsequent
entries indicating progressively less significant boundaries."
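
To make that 1-based indexing concrete, here is a minimal sketch (not from
the patch; the domain IDs and values below are made up for illustration).
A reference point of 4 selects the 4th domain entry of the list; since
cell 0 of the property is the length, ordinal N conveniently lands on
array index N:

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    /* Illustrative ibm,associativity list: cell 0 is the length (4),
     * cells 1..4 are domain IDs, most significant first. */
    uint32_t associativity[] = { 4, 10, 20, 30, 40 };

    /* Illustrative reference points: 1-based ordinals into the
     * domain entries, most significant boundary first. */
    uint32_t refpoints[] = { 4, 4, 2 };

    for (int i = 0; i < 3; i++) {
        printf("refpoint %" PRIu32 " -> domain ID %" PRIu32 "\n",
               refpoints[i], associativity[refpoints[i]]);
    }
    return 0;
}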

Right.. AIUI, associativity-reference-points indicates which levels are
"important" from a NUMA distance point of view (though the spec is
very confusing).  But I'm pretty sure that, ignoring
reference-points, the individual ibm,associativity lists are supposed
to describe a correct hierarchy, even if some levels will get ignored
for distance purposes.  So once you've split up into "numa_id" nodes
at the second level, you can't then go back to just 2 nodes (main
vs. gpu) at the third.
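
To illustrate that point, a minimal sketch (not QEMU code; the node
numbers are made up and 0xffff stands in for SPAPR_GPU_NUMA_ID) of the
tree property being violated: if two resources share a domain ID at some
level, they must also agree at every more significant level, otherwise
that shared domain has two different parents.

#include <stdint.h>
#include <stdio.h>

#define LEVELS 4

/* Returns 1 if the two associativity lists describe a consistent tree:
 * a domain shared at some level must have the same parent domain at
 * the level above. */
static int hierarchy_ok(const uint32_t a[LEVELS], const uint32_t b[LEVELS])
{
    for (int level = 1; level < LEVELS; level++) {
        if (a[level] == b[level] && a[level - 1] != b[level - 1]) {
            return 0; /* shared domain, different parents */
        }
    }
    return 1;
}

int main(void)
{
    /* Two GPUs on different NUMA nodes as the v3 patch lays them out:
     * { GPU_NUMA_ID, numa_id, GPU_NUMA_ID, numa_id } */
    uint32_t gpu0[LEVELS] = { 0xffff, 1, 0xffff, 1 };
    uint32_t gpu1[LEVELS] = { 0xffff, 2, 0xffff, 2 };

    /* Level 3 shares 0xffff but level 2 differs (1 vs 2), so the
     * shared level-3 domain would have two parents. */
    printf("v3 layout consistent: %s\n",
           hierarchy_ok(gpu0, gpu1) ? "yes" : "no");

    /* The skiboot-style all-numa_id lists below split once and
     * stay split. */
    uint32_t fix0[LEVELS] = { 1, 1, 1, 1 };
    uint32_t fix1[LEVELS] = { 2, 2, 2, 2 };
    printf("all-numa_id layout consistent: %s\n",
           hierarchy_ok(fix0, fix1) ? "yes" : "no");
    return 0;
}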


I believe Reza should go with what Skiboot already does in this situation:

(hw/npu2.c)

dt_add_property_cells(mem, "ibm,associativity", 4, chip_id, chip_id, chip_id, chip_id);

Which would translate here to:

        uint32_t associativity[] = {
            cpu_to_be32(0x4),
            cpu_to_be32(nvslot->numa_id),
            cpu_to_be32(nvslot->numa_id),
            cpu_to_be32(nvslot->numa_id),
            cpu_to_be32(nvslot->numa_id),
        };


In the end it doesn't matter for the logic, since the refpoints are always
0x4 0x4 0x2, meaning that we're ignoring the 1st and 3rd elements entirely
anyway, but this at least makes the intention clearer: GPUs are always at the
maximum distance from everything else.
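
A minimal sketch of that equivalence (again purely illustrative, with
0xffff standing in for SPAPR_GPU_NUMA_ID): with refpoints 0x4 0x4 0x2,
both layouts select the same domain IDs, so the computed distances come
out identical either way.

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint32_t numa_id = 1;

    /* Cell 0 is the length; cells 1..4 are the domain IDs. */
    uint32_t v3[]  = { 4, 0xffff, numa_id, 0xffff, numa_id };
    uint32_t all[] = { 4, numa_id, numa_id, numa_id, numa_id };

    /* refpoints 0x4 0x4 0x2: only the 4th and 2nd entries are ever
     * compared, and those are numa_id in both layouts. */
    uint32_t refpoints[] = { 4, 4, 2 };

    for (int i = 0; i < 3; i++) {
        printf("refpoint 0x%" PRIx32 ": v3 -> %" PRIu32
               ", all-numa_id -> %" PRIu32 "\n",
               refpoints[i], v3[refpoints[i]], all[refpoints[i]]);
    }
    return 0;
}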



Thanks,


DHB
