Re: [Qemu-devel] [PATCH 0/7] spapr: rework memory nodes


From: Eduardo Habkost
Subject: Re: [Qemu-devel] [PATCH 0/7] spapr: rework memory nodes
Date: Tue, 17 Jun 2014 11:07:00 -0300
User-agent: Mutt/1.5.23 (2014-03-12)

On Tue, Jun 17, 2014 at 03:51:35PM +1000, Alexey Kardashevskiy wrote:
> On 06/17/2014 06:51 AM, Eduardo Habkost wrote:
> > On Mon, Jun 16, 2014 at 06:16:29PM +1000, Alexey Kardashevskiy wrote:
> >> On 06/16/2014 05:53 PM, Alexey Kardashevskiy wrote:
> >>> c4177479 "spapr: make sure RMA is in first mode of first memory node"
> >>> introduced a regression which prevents running guests with a memoryless
> >>> NUMA node#0, which may happen on real POWER8 boxes and which would make
> >>> sense to debug in QEMU.
> >>>
> >>> The aim of this patchset is to fix that and also to fix various code
> >>> problems in memory node generation.
> >>>
> >>> These 2 patches could be merged (the resulting patch looks rather ugly):
> >>> spapr: Use DT memory node rendering helper for other nodes
> >>> spapr: Move DT memory node rendering to a helper
> >>>
> >>> Please comment. Thanks!
> >>>
> >>
> >> Sure I forgot to add an example of what I am trying to run without errors
> >> and warnings:
> >>
> >> /home/aik/qemu-system-ppc64 \
> >> -enable-kvm \
> >> -machine pseries \
> >> -nographic \
> >> -vga none \
> >> -drive id=id0,if=none,file=virtimg/fc20_24GB.qcow2,format=qcow2 \
> >> -device scsi-disk,id=id1,drive=id0 \
> >> -m 2080 \
> >> -smp 8 \
> >> -numa node,nodeid=0,cpus=0-7,memory=0 \
> >> -numa node,nodeid=2,cpus=0-3,mem=1040 \
> >> -numa node,nodeid=4,cpus=4-7,mem=1040
> > 
> > (Note: I will ignore the "cpus" argument for the discussion below.)
> 
> The example is quite bad, I should not have used the same CPUs in 2 nodes.
> SPAPR allows this but QEMU does not really support this and I am not
> touching this now.
> 
> 
> > 
> > I understand now that the non-contiguous node IDs are guest-visible.
> > 
> > But I still would like to understand the motivations for your use case,
> > to understand which solution makes more sense.
> 
> One example is 2 CPUs on one die: one of the CPUs is connected to the memory
> bus, the other is not; instead it is connected to the first CPU (via a super
> fast bus) and the first CPU acts as a bridge.
> 
> 
> 
> > If you really want 5 nodes, you just need to write this:
> >   -numa node,nodeid=0,cpus=0-7,memory=0 \
> >   -numa node,nodeid=1 \
> >   -numa node,nodeid=2,cpus=0-3,mem=1040 \
> >   -numa node,nodeid=3 \
> >   -numa node,nodeid=4,cpus=4-7,mem=1040
> > 
> > If you just want 3 nodes, you can just write this:
> >   -numa node,nodeid=0,cpus=0-7,memory=0 \
> >   -numa node,nodeid=1,cpus=0-3,mem=1040 \
> >   -numa node,nodeid=4,cpus=4-7,mem=1040
> > 
> > But you seem to claim you need 3 nodes with non-contiguous IDs. In that
> > case, which exactly is the guest-visible difference you expect to get
> > between:
> >   -numa node,nodeid=0,cpus=0-7,memory=0 \
> >   -numa node,nodeid=1 \
> >   -numa node,nodeid=2,cpus=0-3,mem=1040 \
> >   -numa node,nodeid=3 \
> >   -numa node,nodeid=4,cpus=4-7,mem=1040
> > and
> >   -numa node,nodeid=0,cpus=0-7,memory=0 \
> >   -numa node,nodeid=2,cpus=0-3,mem=1040 \
> >   -numa node,nodeid=4,cpus=4-7,mem=1040
> > ?
> > 
> > Because your patch is making both be exactly the same, and I guess you
> > don't want that (otherwise you could simply use the 5-node command-line
> > above and we wouldn't need patch 7/7).
> 
> If it is the canonical and kosher way of using NUMA in QEMU, ok, we can use
> it. I just fail to see why we need a requirement for node IDs to be
> consecutive here. And it confuses me as a user a bit if I can add "-numa
> node,nodeid=22" (no memory, no cpus) but do not get to see it in the guest.

I agree with you that it is confusing. But before we support that use case,
we need to make sure auto-allocation is handled properly, because it
would be hard to fix later without breaking compatibility.

We probably just need a "present" field on struct NodeInfo, so
machine-specific code and auto-allocation code can differentiate nodes
that are absent from the command line from empty nodes that were
explicitly specified on the command line.
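
To make the idea concrete, here is a rough sketch of what I mean. It is
only an illustration under my own assumptions, not actual QEMU code: the
struct is a cut-down stand-in for NodeInfo, and MAX_NODES, numa_node_add(),
numa_node_is_empty() and numa_count_present() are hypothetical names made up
for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_NODES 128 /* illustrative limit, assumed for this sketch */

/* Cut-down stand-in for the per-node record; "present" is the new field. */
typedef struct NodeInfo {
    uint64_t node_mem; /* memory assigned to the node, in bytes */
    bool present;      /* node was given explicitly on the command line */
} NodeInfo;

static NodeInfo numa_info[MAX_NODES];

/* Hypothetical helper: record a node that appeared on the command line. */
static void numa_node_add(unsigned nodeid, uint64_t mem_bytes)
{
    assert(nodeid < MAX_NODES);
    numa_info[nodeid].node_mem = mem_bytes;
    numa_info[nodeid].present = true;
}

/* Hypothetical helper: an explicitly specified node that has no memory. */
static bool numa_node_is_empty(unsigned nodeid)
{
    assert(nodeid < MAX_NODES);
    return numa_info[nodeid].present && numa_info[nodeid].node_mem == 0;
}

/* Hypothetical helper: count only the nodes the user actually asked for. */
static unsigned numa_count_present(void)
{
    unsigned i, count = 0;

    for (i = 0; i < MAX_NODES; i++) {
        if (numa_info[i].present) {
            count++;
        }
    }
    return count;
}
```

With something like this, nodeid=0 with no memory would be "present but
empty", while nodeid=1 and nodeid=3 in the 3-node command line would simply
not be present, so machine code could skip them and auto-allocation could
ignore them.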

In the meantime, people can use the 5-node example above as a
workaround.

> 
> btw how is it supposed to work with memory hotplug? Current "-numa" does
> not support gaps in memory and I would expect that we will need it. Any
> plans here?

The DIMM device used for memory hotplug has a "node" property, for the
NUMA node ID.

-- 
Eduardo


