From: Eduardo Habkost
Subject: Re: [Qemu-devel] cpu modelling and hotplug (was: [PATCH RFC 0/4] target-i386: PC socket/core/thread modeling, part 1)
Date: Thu, 23 Apr 2015 10:17:36 -0300
User-agent: Mutt/1.5.23 (2014-03-12)
On Thu, Apr 23, 2015 at 05:32:33PM +1000, David Gibson wrote:
> On Tue, Apr 07, 2015 at 02:43:43PM +0200, Christian Borntraeger wrote:
> > We had a call and I was asked to write a summary about our conclusion.
> >
> > The more I wrote, the more uncertain I became that we really came to a
> > conclusion, and the more certain that we want to define the QMP/HMP/CLI
> > interfaces first (or quite early in the process).
> >
> > As discussed I will provide an initial document as a discussion starter
> >
> > So here is my current understanding, with each piece of information on
> > its own line so that everybody can correct me or make additions:
> >
> > current wrap-up of architecture support
> > -------------------
> > x86
> > - Topology possible
> > - can be hierarchical
> > - interfaces to query topology
> > - SMT: fanout in host, guest uses host threads to back guest vCPUS
> > - supports cpu hotplug via cpu_add
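For reference, the existing x86 interface looks like this today (the CPU
id here is only illustrative):

```
# HMP monitor command:
(qemu) cpu_add 1

# QMP equivalent:
{ "execute": "cpu-add", "arguments": { "id": 1 } }
```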
> >
> > power
> > - Topology possible
> > - interfaces to query topology?
>
> For power, topology information is communicated via the
> "ibm,associativity" (and related) properties in the device tree. This
> can encode hierarchical topologies, but it is *not* bound to the
> socket/core/thread hierarchy. On the guest side on Power there's no
> real notion of "socket", just cores with specified proximities to
> various memory nodes.
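For illustration, such a property might look like this in the device
tree (a hypothetical PAPR-style fragment; cell values are made up, with
the coarser associativity domains listed before the finer ones):

```dts
/* hypothetical fragment, values illustrative only */
cpus {
	PowerPC,POWER8@0 {
		ibm,associativity = <0x4 0x0 0x0 0x0 0x0>;
	};
};
```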
>
> > - SMT: POWER8: no threads in host, and a full core is passed in due to
> > HW design; this may change in the future
> >
> > s/390
> > - Topology possible
> > - can be hierarchical
> > - interfaces to query topology
> > - always virtualized via PR/SM LPAR
> > - host topology from LPAR can be heterogeneous (e.g. 3 cpus in 1st
> > socket, 4 in 2nd)
> > - SMT: fanout in host, guest uses host threads to back guest vCPUS
> >
> >
> > Current downsides of CPU definitions/hotplug
> > -----------------------------------------------
> > - smp, sockets=,cores=,threads= builds only homogeneous topology
> > - cpu_add does not tell where to add
> > - artificial icc bus construct on x86 for several reasons (link, sysbus not
> > hotpluggable..)
>
> Artificial though it may be, I think having a "cpus" pseudo-bus is not
> such a bad idea.

That was considered before[1][2]. We have use cases for adding
additional information about VCPUs to query-cpus, but we could simply
use qom-get for that. The only thing missing is a predictable QOM path
for VCPU objects.

If we provide something like "/cpus/<cpu>" links on all machines,
callers could simply use qom-get to get just the information they need,
instead of getting too much information from query-cpus (which also has
the side effect of interrupting all running VCPUs to synchronize
register information).
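To make that concrete: with a hypothetical "/machine/cpus/0" link, a
caller could fetch a single property without touching any other VCPU.
The path and property name below are assumptions, not existing
interfaces:

```
-> { "execute": "qom-get",
     "arguments": { "path": "/machine/cpus/0", "property": "halted" } }
<- { "return": false }
```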
Quoting part of your proposal below:
> Ignoring NUMA topology (I'll come back to that in a moment) qemu
> should really only care about two things:
>
> a) the unit of execution scheduling (a vCPU or "thread")
> b) the unit of plug/unplug
>
[...]
> 3) I'm thinking we'd have a "cpus" virtual bus represented in QOM,
> which would contain the vCMs (also QOM objects). Their existence
> would be generic, though we'd almost certainly use arch and/or machine
> specific subtypes.
>
> 4) There would be a (generic) way of finding the vCPUS (threads) in a
> vCM and the vCM for a specific vCPU.
>
What I propose now is a bit simpler: just a mechanism for enumerating
VCPUs/threads (a), that would replace query-cpus. Later we could also
have a generic mechanism for (b), if we decide to introduce a generic
"CPU module" abstraction for plug/unplug.
A more complex mechanism for enumerating vCMs and the vCPUs inside a vCM
would be a superset of (a), so in theory we wouldn't need both. But I
believe that: 1) we will take some time to define the details of the
vCM/plug/unplug abstractions; 2) we already have use cases today[2] that
could benefit from a generic QOM path for (a).
[1] Message-ID: <address@hidden>
http://article.gmane.org/gmane.comp.emulators.qemu/273463
[2] Message-ID: <address@hidden>
http://article.gmane.org/gmane.comp.emulators.kvm.devel/134625
>
> > discussions
> > -------------------
> > - we want to be able to (most important question, IMHO)
> > - hotplug CPUs on power/x86/s390 and maybe others
> > - define topology information
> > - bind the guest topology to the host topology in some way
> > - to host nodes
> > - maybe also for gang scheduling of threads (might face reluctance from
> > the linux scheduler folks)
> > - not really deeply outlined in this call
> > - QOM links must be allocated at boot time, but can be set later on
> > - nothing that we want to expose to users
> > - Machine provides QOM links that the device_add hotplug mechanism can
> > use to add new CPUs into preallocated slots. "CPUs" can be groups of
> > cores and/or threads.
> > - hotplug and initial config should use same semantics
> > - cpu and memory topology might be somewhat independent
> > --> - define nodes
> > - map CPUs to nodes
> > - map memory to nodes
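Part of that mapping is already expressible on today's command line;
something like the following (QEMU 2.x option syntax, values purely
illustrative, not run here):

```
qemu-system-x86_64 \
    -smp 8,sockets=2,cores=2,threads=2 \
    -numa node,nodeid=0,cpus=0-3,mem=2G \
    -numa node,nodeid=1,cpus=4-7,mem=2G
```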
> >
> > - hotplug per
> > - socket
> > - core
> > - thread
> > ?
> > Now comes the part where I am not sure if we came to a conclusion or not:
> > - hotplug/definition per core (but not per thread) seems to handle all cases
> > - core might have multiple threads ( and thus multiple cpustates)
> > - as device statement (or object?)
> > - mapping of cpus to nodes or defining the topology not really
> > outlined in this call
> >
> > To be defined:
> > - QEMU command line for initial setup
> > - QEMU hmp/qmp interfaces for dynamic setup
>
> So, I can't say I've entirely got my head around this, but here are my
> thoughts so far.
>
> I think the basic problem here is that the fixed socket -> core ->
> thread hierarchy is something from x86 land that's become integrated
> into qemu's generic code where it doesn't entirely make sense.
>
> Ignoring NUMA topology (I'll come back to that in a moment) qemu
> should really only care about two things:
>
> a) the unit of execution scheduling (a vCPU or "thread")
> b) the unit of plug/unplug
>
> Now, returning to NUMA topology. What the guest, and therefore qemu,
> really needs to know is the relative proximity of each thread to each
> block of memory. That usually forms some sort of node hierarchy,
> but it doesn't necessarily correspond to a socket->core->thread
> hierarchy you can see in physical units.
>
> On Power, an arbitrary NUMA node hierarchy can be described in the
> device tree without reference to "cores" or "sockets", so really qemu
> has no business even talking about such units.
>
> IIUC, on x86 the NUMA topology is bound up with the socket->core->thread
> hierarchy, so it needs to have a notion of those layers, but ideally
> that would be specific to the pc machine type.
>
> So, here's what I'd propose:
>
> 1) I think we really need some better terminology to refer to the unit
> of plug/unplug. Until someone comes up with something better, I'm
> going to use "CPU Module" (CM), to distinguish from the NUMA baggage
> of "socket" and also to refer more clearly to the thing that goes into
> the socket, rather than the socket itself.
>
> 2) A Virtual CPU Module (vCM) need not correspond to a real physical
> object. For machine types which we want to faithfully represent a
> specific physical machine, it would. For generic or pure virtual
> machines, the vCMs would be as small as possible. So for current
> Power they'd be one virtual core; for future Power (maybe) or s390, a
> single virtual thread. For x86 I'm not sure what they'd be.
>
> 3) I'm thinking we'd have a "cpus" virtual bus represented in QOM,
> which would contain the vCMs (also QOM objects). Their existence
> would be generic, though we'd almost certainly use arch and/or machine
> specific subtypes.
>
> 4) There would be a (generic) way of finding the vCPUS (threads) in a
> vCM and the vCM for a specific vCPU.
>
> 5) A vCM *might* have internal subdivisions into "cores" or "nodes" or
> "chips" or "MCMs" or whatever, but that would be up to the machine
> type specific code, and not represented in the QOM hierarchy.
>
> 6) Obviously we'd need some backwards compat goo to sort out existing
> command line options referring to cores and sockets into the new
> representation. This will need machine type specific hooks - so for
> x86 it would need to set up the right vCM subdivisions and make sure
> the right NUMA topology info goes into ACPI. For -machine pseries I'm
> thinking that "-smp sockets=2,cores=1,threads=4" and "-smp
> sockets=1,cores=2,threads=4" should result in exactly the same thing
> internally.
>
>
> Thoughts?
>
>
> --
> David Gibson                    | I'll have my music baroque, and my code
> david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
>                                 | _way_ _around_!
> http://www.ozlabs.org/~dgibson
--
Eduardo