|
From: | Gavin Shan |
Subject: | Re: [PATCH] hw/arm/boot: Use NUMA node ID in memory node name |
Date: | Thu, 3 Jun 2021 14:48:03 +1000 |
User-agent: | Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.2.0 |
Hi Drew, On 6/2/21 9:36 PM, Andrew Jones wrote:
On Wed, Jun 02, 2021 at 11:09:32AM +1000, Gavin Shan wrote:On 6/1/21 5:50 PM, Andrew Jones wrote:On Tue, Jun 01, 2021 at 03:30:04PM +0800, Gavin Shan wrote:We possibly populate empty nodes where memory isn't included and might be hot added at late time. The FDT memory nodes can't be created due to conflicts on their names if multiple empty nodes are specified. For example, the VM fails to start with the following error messages. /home/gavin/sandbox/qemu.main/build/qemu-system-aarch64 \ -accel kvm -machine virt,gic-version=host \ -cpu host -smp 4,sockets=2,cores=2,threads=1 -m 1024M,maxmem=64G \ -object memory-backend-ram,id=mem0,size=512M \ -object memory-backend-ram,id=mem1,size=512M \ -numa node,nodeid=0,cpus=0-1,memdev=mem0 \ -numa node,nodeid=1,cpus=2-3,memdev=mem1 \ -numa node,nodeid=2 \ -numa node,nodeid=3 \ : -device virtio-balloon-pci,id=balloon0,free-page-reporting=yes qemu-system-aarch64: FDT: Failed to create subnode /memory@80000000: \ FDT_ERR_EXISTS This fixes the issue by using NUMA node ID or zero in the memory node name to avoid the conflicting memory node names. With this applied, the VM can boot successfully with above command lines. Signed-off-by: Gavin Shan <gshan@redhat.com> --- hw/arm/boot.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/hw/arm/boot.c b/hw/arm/boot.c index d7b059225e..3169bdf595 100644 --- a/hw/arm/boot.c +++ b/hw/arm/boot.c @@ -432,7 +432,12 @@ static int fdt_add_memory_node(void *fdt, uint32_t acells, hwaddr mem_base, char *nodename; int ret; - nodename = g_strdup_printf("/memory@%" PRIx64, mem_base); + if (numa_node_id >= 0) { + nodename = g_strdup_printf("/memory@%d", numa_node_id); + } else { + nodename = g_strdup("/memory@0"); + } + qemu_fdt_add_subnode(fdt, nodename); qemu_fdt_setprop_string(fdt, nodename, "device_type", "memory"); ret = qemu_fdt_setprop_sized_cells(fdt, nodename, "reg", acells, mem_base,[...]Is it conventional to use the unit-address like this? If so, can you point out where that's documented? If it's not conventional, then we shouldn't do it. And then I'm not sure what we should do in this case. Here's a couple links I found, but they don't really help... https://devicetree-specification.readthedocs.io/en/latest/chapter2-devicetree-basics.html#sect-node-names https://devicetree-specification.readthedocs.io/en/latest/chapter3-devicenodes.html#memory-nodeAs stated in the document (section 2.2.1.1). It's conventional to take the first address of 'reg' property as unit-address, but it's not mandatory as I understand: (1) In section 2.2.1.1, the bus can specify additional format to unit-address.The bus type we're using is the physical memory map. Does it allow this use of the unit-address? I still haven't seen that documented anywhere.(2) The device node name isn't used to identify the device node in Linux kernel. They are actually identified by 'device_type' property. (drivers/of/fdt.c::early_init_dt_scan_memory())This doesn't matter. We can't change DT node name formats just because they may not impact Linux. We need to follow the DT standard and the Linux convention.I think it's still nice to include the unit-address in meory node's name. For the conflicting nodes, we add more suffix like below. I can update the code in v2 if it's preferred way to go.I don't think we should change the semantics of the unit address at all, not without either a) finding a document that says our use is OK or b) getting it documented in the Linux kernel first.
[...] I've sent one separate mail to check with Rob Herring. Hopefully he have ideas as he is maintaining linux FDT subsystem. You have been included to that thread. I didn't find something meaningful to this thread after doing some google search either. Yes, I agree with you we need to follow the specification strictly. It seems it's uncertain about the 'physical memory map' bus binding requirements. Thanks, Gavin
[Prev in Thread] | Current Thread | [Next in Thread] |