From: Andreas Färber
Subject: Re: [Qemu-devel] [PATCH for-2.3] numa: pc: fix default VCPU to node mapping
Date: Tue, 17 Mar 2015 17:59:48 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.5.0

On 17.03.2015 at 17:42, Eduardo Habkost wrote:
> On Tue, Mar 17, 2015 at 03:48:38PM +0000, Igor Mammedov wrote:
>> Since commit
>>    dd0247e0 pc: acpi: mark all possible CPUs as enabled in SRAT
>> the Linux kernel actually tries to use the CPU-to-node mapping from
>> the QEMU-provided SRAT table instead of discarding it, and that
>> in some cases breaks build_sched_domains(), which expects a
>> sane mapping where cores/threads belonging to the same socket
>> are on the same NUMA node.
>>
>> With the current default round-robin mapping of VCPUs to nodes,
>> the guest ends up with cores/threads belonging to the same socket
>> being on different NUMA nodes.
>>
>> For example, with the following CLI:
>> qemu-kvm -m 4G -smp 5,sockets=1,cores=4,threads=1,maxcpus=8 \
>>          -numa node,nodeid=0 -numa node,nodeid=1
>> 2.6.32-based kernels will hang on boot due to an incorrectly built
>> sched_group-s list in update_sd_lb_stats(),
>> so the comment in QEMU justifying the dumb default mapping:
>>  "
>>   guest OSes must cope with this anyway, because there are BIOSes
>>   out there in real machines which also use this scheme.
>>  "
>> isn't really valid.
>>
>> Replacing the default mapping with a manual one, where VCPUs belonging
>> to the same socket are on the same NUMA node, fixes the issue for
>> guests which can't handle a nonsense topology, i.e. changing the CLI to:
>>   -numa node,nodeid=0,cpus=0-3 -numa node,nodeid=1,cpus=4-7
>>
>> So instead of simply scattering VCPUs around nodes, map
>> VCPUs of the same socket to the same NUMA node, which is what
>> a guest would expect from sane hardware/BIOS.
>>
>> Signed-off-by: Igor Mammedov <address@hidden>
> 
> I believe the proposed behavior is much better. But if we are going to
> break compatibility, shouldn't we at least do that before the first -rc
> so we get feedback in case it breaks existing configurations?
> 
> About qemu_cpu_socket_id_from_index(): all qemu-system-* binaries have
> smp_cores and smp_threads available (even if machines ignore them), but
> the default stub can return values that are larger than the number of
> sockets if smp_cores*smp_threads > 1, which would be obviously
> incorrect. Wouldn't it be easier to simply make
> "cpu_index/(smp_cores*smp_threads)" the default cpu_index->socket
> mapping function, and allow machine-specific (not arch-specific)
> overrides if necessary?
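
To make the two mappings under discussion concrete, here is a minimal C sketch. The helper names are hypothetical and are not QEMU functions; it assumes VCPUs are numbered socket by socket and that whole sockets are then distributed across nodes round-robin, which is one possible reading of the proposed default.

```c
#include <assert.h>

/* Hypothetical helpers for illustration only; not QEMU code. */

/* Old default: scatter VCPUs across nodes one at a time. */
static unsigned node_round_robin(unsigned cpu_index, unsigned nb_nodes)
{
    assert(nb_nodes > 0);
    return cpu_index % nb_nodes;
}

/* Socket index of a VCPU, assuming VCPUs are numbered socket by socket. */
static unsigned socket_of_cpu(unsigned cpu_index,
                              unsigned cores, unsigned threads)
{
    assert(cores > 0 && threads > 0);
    return cpu_index / (cores * threads);
}

/* Socket-aware default (assumption): hand out whole sockets to nodes
 * round-robin, so siblings in a socket always share a node. */
static unsigned node_by_socket(unsigned cpu_index,
                               unsigned cores, unsigned threads,
                               unsigned nb_nodes)
{
    assert(nb_nodes > 0);
    return socket_of_cpu(cpu_index, cores, threads) % nb_nodes;
}
```

With cores=4, threads=1 and two nodes, as in the CLI example, node_round_robin() alternates VCPUs 0-3 between nodes 0 and 1, splitting the socket, while node_by_socket() places VCPUs 0-3 on node 0 and 4-7 on node 1, matching the manual cpus=0-3 / cpus=4-7 assignment.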

I agree that the proposed stub solution is not so nice. Can you propose a
MachineClass-based solution instead?
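
Roughly what I have in mind, as a standalone sketch rather than a patch: a per-machine hook with a generic fallback. BoardClass, Topology and all function names below are made up for illustration and are not QEMU's MachineClass API; treating a SoC board as a single socket is likewise just an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types for illustration; not QEMU's MachineClass. */
typedef struct Topology {
    unsigned cores;    /* cores per socket */
    unsigned threads;  /* threads per core */
} Topology;

typedef struct BoardClass {
    const char *name;
    /* Optional per-machine override; NULL selects the generic default. */
    unsigned (*cpu_index_to_socket_id)(const Topology *topo,
                                       unsigned cpu_index);
} BoardClass;

/* Generic default: cpu_index / (cores * threads). */
static unsigned default_socket_id(const Topology *topo, unsigned cpu_index)
{
    assert(topo->cores > 0 && topo->threads > 0);
    return cpu_index / (topo->cores * topo->threads);
}

/* Override for a board without real sockets (assumption: report socket 0). */
static unsigned soc_socket_id(const Topology *topo, unsigned cpu_index)
{
    (void)topo;
    (void)cpu_index;
    return 0;
}

/* Dispatch: use the machine's override when present, else the default. */
static unsigned board_socket_id(const BoardClass *bc, const Topology *topo,
                                unsigned cpu_index)
{
    if (bc->cpu_index_to_socket_id) {
        return bc->cpu_index_to_socket_id(topo, cpu_index);
    }
    return default_socket_id(topo, cpu_index);
}
```

A machine that leaves the hook NULL gets the generic mapping; a machine that sets it decides for itself, without any arch-wide stub.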

The example I keep bringing up for x86 is that the Galileo boards or
even the Minnow boards don't really have sockets, being SoCs.

Thanks,
Andreas

-- 
SUSE Linux GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Felix Imendörffer, Jane Smithard, Jennifer Guild, Dilip Upmanyu,
Graham Norton; HRB 21284 (AG Nürnberg)