Re: [Qemu-devel] [PATCH 3/7] spapr: Refactor spapr_populate_memory()
From: Alexey Kardashevskiy
Date: Tue, 24 Jun 2014 16:14:11 +1000
User-agent: Mozilla/5.0 (X11; Linux i686 on x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.6.0

On 06/24/2014 01:08 PM, Nishanth Aravamudan wrote:
> On 21.06.2014 [13:06:53 +1000], Alexey Kardashevskiy wrote:
>> On 06/21/2014 08:55 AM, Nishanth Aravamudan wrote:
>>> On 16.06.2014 [17:53:49 +1000], Alexey Kardashevskiy wrote:
>>>> Current QEMU does not support memoryless NUMA nodes.
>>>> This prepares SPAPR for that.
>>>>
>>>> This moves 2 calls of spapr_populate_memory_node() into
>>>> the existing loop which handles nodes other than
>>>> the first one.
>>>>
>>>> Signed-off-by: Alexey Kardashevskiy <address@hidden>
>>>> ---
>>>> hw/ppc/spapr.c | 31 +++++++++++--------------------
>>>> 1 file changed, 11 insertions(+), 20 deletions(-)
>>>>
>>>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>>>> index cb3a10a..666b676 100644
>>>> --- a/hw/ppc/spapr.c
>>>> +++ b/hw/ppc/spapr.c
>>>> @@ -689,28 +689,13 @@ static void spapr_populate_memory_node(void *fdt, int nodeid, hwaddr start,
>>>>
>>>>  static int spapr_populate_memory(sPAPREnvironment *spapr, void *fdt)
>>>>  {
>>>> -    hwaddr node0_size, mem_start, node_size;
>>>> +    hwaddr mem_start, node_size;
>>>>      int i;
>>>>
>>>> -    /* memory node(s) */
>>>> -    if (nb_numa_nodes > 1 && node_mem[0] < ram_size) {
>>>> -        node0_size = node_mem[0];
>>>> -    } else {
>>>> -        node0_size = ram_size;
>>>> -    }
>>>> -
>>>> -    /* RMA */
>>>> -    spapr_populate_memory_node(fdt, 0, 0, spapr->rma_size);
>>>> -
>>>> -    /* RAM: Node 0 */
>>>> -    if (node0_size > spapr->rma_size) {
>>>> -        spapr_populate_memory_node(fdt, 0, spapr->rma_size,
>>>> -                                   node0_size - spapr->rma_size);
>>>> -    }
>>>> -
>>>> -    /* RAM: Node 1 and beyond */
>>>> -    mem_start = node0_size;
>>>> -    for (i = 1; i < nb_numa_nodes; i++) {
>>>> +    for (i = 0, mem_start = 0; i < nb_numa_nodes; ++i) {
>>>> +        if (!node_mem[i]) {
>>>> +            continue;
>>>> +        }
>>>
>>> Doesn't this skip memoryless nodes? What actually puts the memoryless
>>> node in the device-tree?
>>
>> It does skip.
>>
>>> And if you were to put them in, wouldn't spapr_populate_memory_node()
>>> fail because we'd be creating two nodes with memory@XXX where XXX is the
>>> same (starting address) for both?
>>
>> I cannot do this now: there is no way to tell from the command line where
>> I want a NUMA node's memory to start, so I'll end up with multiple nodes with
>> the same name and QEMU won't start. When the NUMA fixes reach upstream, I'll
>> try to work out something on top of that.
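
To make the name clash above concrete, here is a minimal sketch, not the QEMU code. It assumes nodes are laid out back to back from address 0 and that each memory node is named after its start address; count_duplicate_names(), MAX_NODES and NAME_LEN are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define MAX_NODES 8   /* hypothetical limit for this sketch */
#define NAME_LEN  32  /* enough for "memory@" plus 16 hex digits */

/*
 * Lay the nodes out back to back starting at address 0 and build the
 * "memory@<start>" name each one would get if every node, including
 * zero-size ones, were emitted. Returns how many nodes end up with a
 * name that an earlier node already uses.
 */
static int count_duplicate_names(const uint64_t *node_size, int nb_nodes)
{
    char names[MAX_NODES][NAME_LEN];
    uint64_t start = 0;
    int dups = 0;

    assert(nb_nodes >= 0 && nb_nodes <= MAX_NODES);

    for (int i = 0; i < nb_nodes; i++) {
        snprintf(names[i], NAME_LEN, "memory@%" PRIx64, start);
        for (int j = 0; j < i; j++) {
            if (strcmp(names[i], names[j]) == 0) {
                dups++;
                break;
            }
        }
        /* A zero-size node does not advance start, so the next node
         * gets the same name. */
        start += node_size[i];
    }
    return dups;
}
```

With sizes {0, 4096 MB} both nodes would be named memory@0, which is the clash described above; skipping zero-size nodes, as the patch does, avoids it.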
>
> Ah I got something here. With the patches I just sent to enable sparse
> NUMA nodes, plus your series rebased on top, here's what I see in a
> Linux LPAR:
>
> qemu-system-ppc64 -machine pseries,accel=kvm,usb=off -m 4096 -realtime
> mlock=off -numa node,nodeid=3,mem=4096,cpus=2-3 -numa
> node,nodeid=2,mem=0,cpus=0-1 -smp 4
>
> info numa
> 2 nodes
> node 2 cpus: 0 1
> node 2 size: 0 MB
> node 3 cpus: 2 3
> node 3 size: 4096 MB
>
> numactl --hardware
> available: 3 nodes (0,2-3)
> node 0 cpus:
> node 0 size: 0 MB
> node 0 free: 0 MB
> node 2 cpus: 0 1
> node 2 size: 0 MB
> node 2 free: 0 MB
> node 3 cpus: 2 3
> node 3 size: 4073 MB
> node 3 free: 3830 MB
> node distances:
> node   0   2   3
>   0:  10  40  40
>   2:  40  10  40
>   3:  40  40  10
>
> The trick, it seems, is if you have a memoryless node, it needs to
> have CPUs assigned to it.

Yep. The device tree does not have NUMA nodes; it only has CPUs and
memory@XXX nodes (memory banks?), and the guest kernel has to parse
ibm,associativity and reconstruct the NUMA topology. If a node is not
mentioned in any ibm,associativity, it does not exist.

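
A rough sketch of that reconstruction, under a strong simplification: each CPU and each memory bank is reduced to a single node id, whereas the real ibm,associativity property is a list of cells that the guest has to decode. nodes_seen() is a hypothetical helper, not a kernel or QEMU function.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Simplified stand-in for what the guest does: collect every node id
 * that some CPU or memory bank refers to. Bit n of the result is set
 * if node n is mentioned at least once; a node that nobody mentions
 * never shows up.
 */
static uint64_t nodes_seen(const int *cpu_node, int nb_cpus,
                           const int *mem_node, int nb_mem)
{
    uint64_t mask = 0;

    for (int i = 0; i < nb_cpus; i++) {
        assert(cpu_node[i] >= 0 && cpu_node[i] < 64);
        mask |= UINT64_C(1) << cpu_node[i];
    }
    for (int i = 0; i < nb_mem; i++) {
        assert(mem_node[i] >= 0 && mem_node[i] < 64);
        mask |= UINT64_C(1) << mem_node[i];
    }
    return mask;
}
```

For the command line quoted above (CPUs 0-1 on node 2, CPUs 2-3 and all memory on node 3) this yields bits 2 and 3: node 2 is visible only because CPUs refer to it.
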
> The CPU's "ibm,associativity" property will
> make Linux set up the proper NUMA topology.
>
> Thoughts? Should there be a check that every "present" NUMA node
> has at least either CPUs or memory?

Maybe. I'll wait for the NUMA stuff to land upstream, apply your patch(es)
and my patches, and see what I get :)
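
If such a check turns out to be wanted, it could look roughly like the sketch below. This is only an illustration: first_empty_node() and its per-node arrays are hypothetical and do not correspond to existing QEMU structures.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return the index of the first node that has neither memory nor
 * CPUs, or -1 if every node has at least one of the two.
 * node_mem[i] is the node's memory size in bytes, node_cpus[i] the
 * number of CPUs assigned to it.
 */
static int first_empty_node(const uint64_t *node_mem,
                            const unsigned *node_cpus, int nb_nodes)
{
    assert(nb_nodes >= 0);

    for (int i = 0; i < nb_nodes; i++) {
        if (node_mem[i] == 0 && node_cpus[i] == 0) {
            return i;
        }
    }
    return -1;
}
```

A caller could refuse to start, or just warn, when the result is not -1; which of the two is right is exactly the open question here.
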
> It seems like if neither are present,
> we can just hotplug them later?

Hotplug what? NUMA nodes?

> Does qemu support topology for PCI devices?
Nope.
--
Alexey