Re: [PATCH] spapr: Add a new level of NUMA for GPUs



From:	Reza Arbab
Subject:	Re: [PATCH] spapr: Add a new level of NUMA for GPUs
Date:	Wed, 13 May 2020 19:19:02 -0500

Hi David,

Thanks for your quick response!

On Mon, May 11, 2020 at 04:17:45PM +1000, David Gibson wrote:

1)

This would all be much simpler, if PAPR's representation of NUMA
distances weren't so awful.  Somehow it manages to be both so complex
that it's very hard to understand, and yet very limited in that it
has no way to represent distances in any absolute units, or even
specific ratios between distances.

Both qemu and the guest kernel can have an arbitrary set of nodes,
with an arbitrary matrix of distances between each pair, which we then
have to lossily encode into this PAPR nonsense.

Completely agree. I've revisited that section many times now and still find the descriptions of these properties almost incomprehensible.

The only way I've been able to make sense of it comes from reading the implementation, or experimentally tweaking the code/device tree to see if distances behave the way I expect.

2)

Alas, I don't think we can simply change this information.  We'll have
to do it conditionally on a new machine type.  This is guest visible
information, which shouldn't change under a running guest across
migration between different qemu versions.  At least for Linux guests
we'd probably get away with it, since I think it only reads this info
at boot, and across a migration we'd at worst get non-optimal
behaviour, not actual breakage.

Sure, that makes sense. I'll try making the change conditional on a flag that can be set on new machine types.

3)

I'm not sure that this version is totally correct w.r.t. PAPR.  But
then, I'm also not really sure of that for the existing version.
Specifically it's not at all clear from PAPR if the IDs used at each
level of the ibm,associativity need to be:
  a) globally unique
  b) unique only within the associativity level they appear at
or c) unique only within the "node" at the next higher level they
     belong to


Again, I'm no authority but it seems to be (b):

  "To determine the associativity between any two resources, the OS
  scans down the two resources associativity lists in all pair wise
  combinations counting how many domains are the same until the first
  domain where the two list do not agree."

FWIW, using the same number for id at multiple levels has been working in practice.
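
As a rough illustration of reading (b), here is a minimal sketch of the level-by-level scan the quoted text describes. It is not taken from QEMU or Linux: shared_domains() is a hypothetical helper, the lists omit the leading length cell of ibm,associativity, and it assumes a single list per resource rather than "all pair wise combinations".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (an assumption for illustration only): count how many
 * leading domains two associativity lists share. Level i of one list is only
 * ever compared with level i of the other, so an ID reused at a different
 * level never produces a match.
 */
static size_t shared_domains(const uint32_t *a, const uint32_t *b, size_t depth)
{
    size_t i;

    assert(a != NULL && b != NULL);
    for (i = 0; i < depth; i++) {
        if (a[i] != b[i]) {
            break;  /* first level where the two lists do not agree */
        }
    }
    return i;
}
```

Under this reading, the lists {1, 0, 0, 0} and {0, 1, 0, 0} share no domains even though both contain the ID 1, because the 1 sits at a different level in each.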

4)

I'm not totally clear on the rationale for using the individual gpu's
numa ID at all levels, rather than just one.  I'm guessing this is so
that the gpu memory blocks are distant from each other as well as
distant from main memory.  Is that right?

It's not necessary to use it at all levels--that was me trying to completely replicate that old firmware change. Strictly speaking, since only reference-points 4 (and now 2) are significant, that part could just be:


@@ -362,7 +362,7 @@ void spapr_phb_nvgpu_ram_populate_dt(SpaprPhbState *sphb, void *fdt)
         uint32_t associativity[] = {
             cpu_to_be32(0x4),
             SPAPR_GPU_NUMA_ID,
-            SPAPR_GPU_NUMA_ID,
+            cpu_to_be32(nvslot->numa_id),
             SPAPR_GPU_NUMA_ID,
             cpu_to_be32(nvslot->numa_id)
         };

I think the rationale is that if those other levels got added to reference-points in the future, you'd likely want the GPU to be distinct there too.
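
To make the effect of reference-points 4 and 2 concrete, here is a small sketch under stated assumptions. It assumes the guest starts from a local distance of 10 and doubles it for each reference point at which two nodes differ, stopping at the first match, which is how Linux guests are understood to behave. It also assumes ordinary memory nodes use { 4, 0, 0, 0, node_id }. GPU_DOMAIN is a placeholder standing in for SPAPR_GPU_NUMA_ID, sketch_distance() is a hypothetical name, and the values are plain host-endian integers rather than cpu_to_be32() cells.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LOCAL_DISTANCE 10  /* assumed base distance for a node to itself */
#define GPU_DOMAIN     1   /* placeholder for SPAPR_GPU_NUMA_ID (assumption) */

/* Reference points as discussed above: cell 4 first, then cell 2. */
static const size_t ref_points[] = { 4, 2 };

/*
 * Hypothetical distance calculation. Both arrays include the leading length
 * cell at index 0, so they must hold at least five elements.
 */
static int sketch_distance(const uint32_t *a, const uint32_t *b)
{
    int distance = LOCAL_DISTANCE;
    size_t i;

    assert(a != NULL && b != NULL);
    for (i = 0; i < sizeof(ref_points) / sizeof(ref_points[0]); i++) {
        if (a[ref_points[i]] == b[ref_points[i]]) {
            break;      /* same domain at this reference point */
        }
        distance *= 2;  /* one more NUMA level apart */
    }
    return distance;
}
```

With the arrays { 4, 0, 0, 0, 0 } and { 4, 0, 0, 0, 1 } for two ordinary nodes, and { 4, GPU_DOMAIN, 5, GPU_DOMAIN, 5 } for a GPU whose numa_id is 5, this sketch puts the two ordinary nodes at distance 20 and the GPU at distance 40 from either of them. Cells 1 and 3 never enter the calculation, which is why their contents only matter if those levels are later added to reference-points.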


--
Reza Arbab
