Re: [Qemu-devel] [PATCH 3/7] spapr: Refactor spapr_populate_memory()


From: Nishanth Aravamudan
Subject: Re: [Qemu-devel] [PATCH 3/7] spapr: Refactor spapr_populate_memory()
Date: Mon, 23 Jun 2014 20:08:56 -0700
User-agent: Mutt/1.5.21 (2010-09-15)

On 21.06.2014 [13:06:53 +1000], Alexey Kardashevskiy wrote:
> On 06/21/2014 08:55 AM, Nishanth Aravamudan wrote:
> > On 16.06.2014 [17:53:49 +1000], Alexey Kardashevskiy wrote:
> >> Current QEMU does not support memoryless NUMA nodes.
> >> This prepares SPAPR for that.
> >>
> >> This moves 2 calls of spapr_populate_memory_node() into
> >> the existing loop which handles nodes other than
> >> the first one.
> >>
> >> Signed-off-by: Alexey Kardashevskiy <address@hidden>
> >> ---
> >>  hw/ppc/spapr.c | 31 +++++++++++--------------------
> >>  1 file changed, 11 insertions(+), 20 deletions(-)
> >>
> >> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> >> index cb3a10a..666b676 100644
> >> --- a/hw/ppc/spapr.c
> >> +++ b/hw/ppc/spapr.c
> >> @@ -689,28 +689,13 @@ static void spapr_populate_memory_node(void *fdt, int nodeid, hwaddr start,
> >>
> >>  static int spapr_populate_memory(sPAPREnvironment *spapr, void *fdt)
> >>  {
> >> -    hwaddr node0_size, mem_start, node_size;
> >> +    hwaddr mem_start, node_size;
> >>      int i;
> >>
> >> -    /* memory node(s) */
> >> -    if (nb_numa_nodes > 1 && node_mem[0] < ram_size) {
> >> -        node0_size = node_mem[0];
> >> -    } else {
> >> -        node0_size = ram_size;
> >> -    }
> >> -
> >> -    /* RMA */
> >> -    spapr_populate_memory_node(fdt, 0, 0, spapr->rma_size);
> >> -
> >> -    /* RAM: Node 0 */
> >> -    if (node0_size > spapr->rma_size) {
> >> -        spapr_populate_memory_node(fdt, 0, spapr->rma_size,
> >> -                                   node0_size - spapr->rma_size);
> >> -    }
> >> -
> >> -    /* RAM: Node 1 and beyond */
> >> -    mem_start = node0_size;
> >> -    for (i = 1; i < nb_numa_nodes; i++) {
> >> +    for (i = 0, mem_start = 0; i < nb_numa_nodes; ++i) {
> >> +        if (!node_mem[i]) {
> >> +            continue;
> >> +        }
> > 
> > Doesn't this skip memoryless nodes? What actually puts the memoryless
> > node in the device-tree?
> 
> It does skip.
> 
> > And if you were to put them in, wouldn't spapr_populate_memory_node()
> > fail because we'd be creating two nodes with memory@XXX where XXX is the
> > same (starting address) for both?
> 
> I cannot do this now - there is no way to tell from the command line where
> I want NUMA node memory start from so I'll end up with multiple nodes with
> the same name and QEMU won't start. When NUMA fixes reach upstream, I'll
> try to work out something on top of that.

Ah, I have something working here. With the patches I just sent to
enable sparse NUMA nodes, plus your series rebased on top, here's what
I see in a Linux LPAR:

qemu-system-ppc64 -machine pseries,accel=kvm,usb=off -m 4096 \
    -realtime mlock=off \
    -numa node,nodeid=3,mem=4096,cpus=2-3 \
    -numa node,nodeid=2,mem=0,cpus=0-1 \
    -smp 4

info numa
2 nodes
node 2 cpus: 0 1
node 2 size: 0 MB
node 3 cpus: 2 3
node 3 size: 4096 MB

numactl --hardware
available: 3 nodes (0,2-3)
node 0 cpus:
node 0 size: 0 MB
node 0 free: 0 MB
node 2 cpus: 0 1
node 2 size: 0 MB
node 2 free: 0 MB
node 3 cpus: 2 3
node 3 size: 4073 MB
node 3 free: 3830 MB
node distances:
node   0   2   3 
  0:  10  40  40 
  2:  40  10  40 
  3:  40  40  10 

The trick, it seems, is that a memoryless node needs to have CPUs
assigned to it. The CPUs' "ibm,associativity" property then makes
Linux set up the proper NUMA topology.
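
To illustrate why the memoryless node produces no name clash here, below
is a minimal standalone sketch of the layout the refactored loop ends up
with. It is not the spapr.c code: MemRange and layout_memory_nodes() are
hypothetical names, and the sketch leaves out the RMA split and the
clamping against ram_size. It only assumes, as in the quoted patch, that
nodes are laid out back to back from address 0 and that a node with no
memory is skipped.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one memory node that would be emitted. */
typedef struct {
    int nodeid;
    uint64_t start;
    uint64_t size;
} MemRange;

/*
 * Lay nodes out back to back from address 0, skipping nodes without
 * memory, so that no two emitted ranges share a start address.
 * Returns the number of ranges written to out[] (at most nb_nodes).
 */
static size_t layout_memory_nodes(const uint64_t *node_mem, int nb_nodes,
                                  MemRange *out)
{
    uint64_t mem_start = 0;
    size_t n = 0;
    int i;

    assert(node_mem != NULL && out != NULL);

    for (i = 0; i < nb_nodes; i++) {
        if (node_mem[i] == 0) {
            continue;   /* memoryless: no memory node is emitted */
        }
        out[n].nodeid = i;
        out[n].start = mem_start;
        out[n].size = node_mem[i];
        n++;
        mem_start += node_mem[i];
    }
    return n;
}
```

With node_mem[] indexed by node id and only node 3 holding 4096 MB, this
yields a single range for node 3 starting at 0; node 2 shows up in the
guest only through its CPUs.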

Thoughts? Should there be a check that every "present" NUMA node has
at least either CPUs or memory? It seems like if neither is present,
we can just hotplug them later? Does QEMU support topology for PCI
devices?
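
A rough sketch of what such a check could look like, purely as an
illustration: NodeInfo and find_empty_present_node() are hypothetical
names, not existing QEMU structures or functions, and the per-node CPU
count is assumed to have been gathered elsewhere.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-node summary; QEMU keeps its NUMA state differently. */
typedef struct {
    bool present;       /* node was given on the command line */
    uint64_t mem_size;  /* bytes of memory assigned to the node */
    unsigned nr_cpus;   /* number of CPUs assigned to the node */
} NodeInfo;

/*
 * Return the id of the first present node that has neither CPUs nor
 * memory, or -1 if every present node has at least one of the two.
 */
static int find_empty_present_node(const NodeInfo *nodes, int nb_nodes)
{
    int i;

    assert(nodes != NULL);

    for (i = 0; i < nb_nodes; i++) {
        if (nodes[i].present && nodes[i].mem_size == 0 &&
            nodes[i].nr_cpus == 0) {
            return i;
        }
    }
    return -1;
}
```

Whether a hit should be a hard error or just a warning is exactly the
open question above; the sketch only locates the offending node.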

Thanks,
Nish



