Re: [Qemu-devel] [RFC PATCH v0 0/9] Generic cpu-core device


From: Zhu Guihua
Subject: Re: [Qemu-devel] [RFC PATCH v0 0/9] Generic cpu-core device
Date: Thu, 24 Dec 2015 09:59:53 +0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.4.0


On 12/17/2015 05:58 AM, Igor Mammedov wrote:
On Wed, 16 Dec 2015 16:46:37 +0100
Andreas Färber <address@hidden> wrote:

On 10.12.2015 at 13:35, Igor Mammedov wrote:
wrt CLI can't we do something like this?

-device some-cpu-model,socket=x[,core=y[,thread=z]]
That's problematic, and it's where my x86 remodeling got stuck. It
works fine (more or less) to model sockets, cores and hyperthreads
for -smp, but doing it dynamically did not work well. How do you
determine the instance size that a socket with N cores and M threads
needs?
-smp defines the necessary topology; all one needs is to find out the
instance size of a core or thread object. That would probably be x86
specific, but it's doable. The only thing the current x86 thread needs
to fix is to reserve space for the largest APIC type and to replace
object_new() with object_initialize_with_type().
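As a rough sketch of that idea (the type name, size constant and parent
class below are made up for illustration; only object_initialize(), the
public wrapper around object_initialize_with_type(), and the
embed-the-child approach are existing QOM practice), the thread object
could reserve storage large enough for any APIC variant and initialize
it in place:

    #include "qom/object.h"
    #include "qom/cpu.h"                /* CPUState, 2015-era header path */

    /* assumed upper bound on the instance size of any APIC variant */
    #define APIC_MAX_INSTANCE_SIZE 4096

    typedef struct X86CPUThreadSketch {
        CPUState parent_obj;
        uint8_t apic_storage[APIC_MAX_INSTANCE_SIZE]; /* no heap allocation */
    } X86CPUThreadSketch;

    static void x86_cpu_thread_sketch_initfn(Object *obj)
    {
        X86CPUThreadSketch *s = (X86CPUThreadSketch *)obj;

        /* in-place init instead of object_new(), so instance_init
         * cannot fail on allocation */
        object_initialize(s->apic_storage, sizeof(s->apic_storage), "apic");
    }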

However, looking at it from the x86 point of view, there is no need to
model sockets or cores; threads are sufficient for QEMU's needs.
Dummy sockets/cores would just complicate the implementation.

What I'm advocating is to let each arch decide whether it should
create CPUs per socket, core or thread.

And for x86, do this at the thread level; that way we keep
compatibility with cpu-add, but also allow selecting which CPU thread
to plug with 'node=n,socket=x,core=y,thread=z'.
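For illustration only (this option syntax is the proposal being
discussed here, not something QEMU accepts today, and the CPU type
name is just an example), hotplugging one specific thread could then
look like:

    (qemu) device_add qemu64-x86_64-cpu,node=0,socket=1,core=0,thread=1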

Another point in favor of thread granularity for x86 is that competing
hypervisors already hotplug at the thread level; QEMU would be worse
off in feature parity if the minimal hotplug unit were a socket.

That also has the benefit of being very flexible and would suit the
engineering audience of QEMU, allowing them to build CPUs from
configuration instead of hardcoding them in code, and to play with
heterogeneous configurations.

The options 'node=n,socket=x,core=y,thread=z' are just an SMP-specific
way of saying where the CPU should be attached; it could become a QOM
path in the future, once we arrive there and have a stable QOM tree.

Does this mean we will drop the current apic_id on the command line?
Also, we already implemented the 'node=n' option a year ago.

Thanks,
Zhu

Allocations in instance_init are to be avoided with a view to
hot-plug.
So either we have a fully determined socket object or we
need to wire individual objects on the command line. The latter has
bad implications for atomicity and thus hot-unplug. That leaves us
What are these bad implications, and how do they affect unplug?

If, for example, the x86 CPU thread is fixed to embed a child APIC,
then to avoid allocations as much as possible, or to fail gracefully,
there are 2 options:
  1: like you've said, reserve all needed space at startup, i.e.
     pre-create empty sockets
  2: fail gracefully in qdev_device_add() if allocation is not possible

For #2 it's not enough to avoid allocations in instance_init(); we
also must teach qdev_device_add() to find out the size of the
to-be-created object and replace object_new() with malloc() +
object_initialize_with_type(), so that allocation failure can be
handled gracefully and reported as an error.
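A minimal sketch of what #2 could look like (type_get_instance_size()
is a made-up helper standing in for whatever interface would expose the
type's instance size; g_try_malloc0(), object_initialize(), error_setg()
and OBJECT() are existing APIs):

    #include "qom/object.h"
    #include "qapi/error.h"

    /* hypothetical helper exposing the instance size of a QOM type */
    size_t type_get_instance_size(const char *typename);

    static Object *qdev_try_new(const char *typename, Error **errp)
    {
        size_t size = type_get_instance_size(typename);
        void *mem = g_try_malloc0(size);    /* may fail, unlike g_malloc0() */

        if (!mem) {
            error_setg(errp, "not enough memory for device '%s'", typename);
            return NULL;
        }
        object_initialize(mem, size, typename); /* in-place, no further alloc */
        return OBJECT(mem);
    }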

Doing that would benefit not only CPUs but every device_add-capable
device, and it is sufficient for hotplug purposes without the overhead
of reserving space for every possible hotplugged device at startup
(which is impossible to do generically anyway).
So I'd go for #2 (a sane device_add implementation) rather than #1
(preallocated objects).
with dynamic properties doing allocations and reporting it via
Error**, something I never finished and could use reviewers and
contributors.
Most dynamic properties are actually static; it looks like what QOM
really needs is static properties, so we don't misuse the former, plus
probably a way to reserve space for a declared number of dynamic ones,
to avoid allocations in instance_init().

Anthony's old suggestion had been to use real socket product names
like Xeon-E5-4242 to get a 6-core, dual-thread socket, without
parameters - unfortunately I still don't see an easy way to define
such a thing today with the flexibility users will undoubtedly want.
I don't see it either, and for me it is much harder to remember what
Xeon-E5-4242 is; it's much easier to say:
    I want N [cpu-foo] threads
which in the SMP world could be expressed by adding N thread objects
at specified locations:
    device_add cpu-foo, with optional node=n,socket=x,core=y,thread=z
allows doing exactly that.
And for x86 there are lots of these Xeon-foo/whatever-foo codenames,
which would be a nightmare to maintain.

And since the question came up how to detect this, what you guys seem
to keep forgetting is that somewhere there also needs to be a matching
link<> property that determines what can be plugged, i.e. QMP
qom-list. link<>s are the QOM equivalent to qdev's buses. The object
itself needs to live in /machine/peripheral
or /machine/peripheral-anon (/machine/unattached is supposed to go
away after the QOM conversion is done!) and in a machine-specific
place there will be a /machine/cpu-socket[0]
-> /machine/peripheral-anon/device[42] link<x86_64-cpu-socket>
property. It might just as well
be /machine/daughterboard-x/cpu-core[2] -> /machine/peripheral/cpu0.
(Gentle reminder of the s390 ipi modeling discussion that never came
to any conclusion iirc.)
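To sketch what such a link<> slot could look like (the machine state
struct, its field and the socket type name are illustrative; the
object_property_add_link() call itself is the existing QOM API of that
era):

    #include "qom/object.h"
    #include "qapi/error.h"
    #include "hw/boards.h"               /* MachineState */

    typedef struct MyMachineState {      /* hypothetical machine state */
        MachineState parent_obj;
        Object *cpu_socket0;             /* set when a socket is plugged */
    } MyMachineState;

    static void my_machine_instance_init(Object *obj)
    {
        MyMachineState *ms = (MyMachineState *)obj;

        /* expose a pluggable CPU-socket slot as a link<> property, so
         * management can discover it via QMP qom-list */
        object_property_add_link(obj, "cpu-socket[0]",
                                 "x86_64-cpu-socket",      /* assumed type */
                                 &ms->cpu_socket0,
                                 object_property_allow_set_link,
                                 OBJ_PROP_LINK_UNREF_ON_RELEASE,
                                 &error_abort);
    }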
The QOM view is probably too unstable to become ABI and, as you noted,
it might be machine specific. To be more generic and consumable by
libvirt, it could be a 'virtual' flat list
/machine/cpus/cpu-N-S-C-T<FOOO>[x], where FOOO could be
socket|core|thread depending on the granularity at which the arch
allows creating CPUs, and N,S,C,T specify the 'where' part that
corresponds to the link.

But I think a separate QMP command to list present/missing CPUs with
their properties would be easier to maintain and adapt to different
archs, without the need to commit part of the QOM tree as ABI.
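Purely as a strawman (no such command exists at this point; the name
and fields below are invented just to show the shape of the idea),
such a command could return something like:

    -> { "execute": "query-cpu-slots" }
    <- { "return": [
           { "type": "qemu64-x86_64-cpu", "node": 0, "socket": 0,
             "core": 0, "thread": 0, "present": true },
           { "type": "qemu64-x86_64-cpu", "node": 0, "socket": 1,
             "core": 0, "thread": 0, "present": false }
       ] }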

Note that I have not read this patch series yet, just some of the
alarming review comments.

Regards,
Andreas



