qemu-devel

Re: [Qemu-devel] -cpu host (was Re: KVM call minutes for 2013-08-06)


From: Peter Maydell
Subject: Re: [Qemu-devel] -cpu host (was Re: KVM call minutes for 2013-08-06)
Date: Thu, 8 Aug 2013 19:20:41 +0100

On 8 August 2013 16:55, Andreas Färber <address@hidden> wrote:
> On 08.08.2013 14:51, Peter Maydell wrote:
>> So, coming at this from an ARM perspective:
>> Should any target arch that supports KVM also support "-cpu host"?
>> If so, what should it do?
>
> I think that depends on the target and whether/what is useful.

The most immediate problem is that we don't want to have to give
QEMU a lot of information about v8 CPUs that it doesn't really
need just in order to start a VM; I think -cpu host would fix
that particular problem.

>> Is there a description somewhere of
>> what the x86 and PPC semantics of -cpu host are?
>
> I'm afraid our usual documentation will be reading the source code. ;)
>
> x86 was first to implement -cpu host and passed through pretty much all
> host features even if they would not work without additional support
> code. I've seen a bunch of bugs where that leads to GMP and others
> breaking badly. Lately in the case of PMU we've started to limit that.
> Alex proposed -cpu best, which was never merged to date. It was similar
> to how ppc's -cpu host works:
>
> ppc matches the Processor Version Register (PVR) in kvm.c against its
> known models from cpu-models.c (strictly today, mask being discussed).
> The PVR can be read from userspace via mfpvr alias to mfspr (Move From
> Special Purpose Register; possibly emulated for userspace by kernel?).
> CPU features are all QEMU-driven AFAIU, through the "CPU families" in
> translate_init.c. Beware, everything is highly macro'fied in ppc code.

In theory we could do a similar thing for ARM (pull the CPU
implementer/part numbers out of cpuinfo and match them against
QEMU's list of known CPUs). However that means you can't run
KVM on a CPU which QEMU doesn't know about, which was one
of the reasons for the approach I suggested below.
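
A minimal sketch of that cpuinfo-matching idea, in Python for brevity rather than QEMU's C. The table, the function names and the sample text are hypothetical; the field names follow what ARM Linux prints in /proc/cpuinfo. It also shows the drawback: a part number missing from the table simply fails to match.

```python
import re

# Hypothetical table of CPUs "QEMU knows about", keyed by the
# (implementer, part) pair reported in /proc/cpuinfo on ARM Linux.
KNOWN_CPUS = {
    (0x41, 0xC07): "cortex-a7",
    (0x41, 0xC09): "cortex-a9",
    (0x41, 0xC0F): "cortex-a15",
}

def parse_cpuinfo(text):
    """Return (implementer, part) for the first CPU entry, or None."""
    impl = re.search(r"^CPU implementer\s*:\s*(0x[0-9a-fA-F]+)", text, re.M)
    part = re.search(r"^CPU part\s*:\s*(0x[0-9a-fA-F]+)", text, re.M)
    if not impl or not part:
        return None
    return int(impl.group(1), 16), int(part.group(1), 16)

def match_host_cpu(text):
    """Return the known model name, or None if the host CPU is unknown."""
    return KNOWN_CPUS.get(parse_cpuinfo(text))

# Illustrative cpuinfo excerpt, not captured from real hardware.
SAMPLE = """\
Processor       : ARMv7 Processor rev 4 (v7l)
CPU implementer : 0x41
CPU architecture: 7
CPU variant     : 0x0
CPU part        : 0xc0f
CPU revision    : 4
"""

if __name__ == "__main__":
    print(match_host_cpu(SAMPLE))
    # A part number that is not in the table cannot be matched at all.
    print(match_host_cpu(SAMPLE.replace("0xc0f", "0xd07")))
```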

>> For ARM you can't get at feature info of the host from userspace
>> (unless you want to get into parsing /proc/cpuinfo), so my current
>> idea is to have KVM_ARM_VCPU_INIT support a target-cpu-type
>> which means "whatever host CPU is". Then when we've created the
>> vcpu we can populate QEMU's idea of what the CPU features are
>> by using the existing ioctls for reading the cp15 registers of
>> the vcpu.
>
> Sounds sane to me iff those cp15 registers all work with KVM and don't
> need any additional KVM/QEMU/device code.

Yes; KVM won't tell us about CP15 registers unless they
are exposed to the guest VM (that is, we're querying the
VCPU, not the host CPU). More generally, the cp15 "tuple
list" code I landed a couple of months back makes the kernel
the authoritative source for which cp15 registers exist and
what their values are -- in -enable-kvm mode QEMU no longer
cares about them (its own list of which registers exist for
which CPU is used only for TCG).

>> The other unresolved thing is what "-cpu host" ought to mean
>> for the CPU's on-chip peripherals (of which the major one is
>> the interrupt controller) -- if the host is an A57 should
>> this imply that you always get the A57's GICv3, or is it OK
>> to provide an A57 with a GICv2? At the moment QEMU models the
>> per-cpu peripherals in a somewhat more semi-detached fashion
>> than is the case in silicon, treating them as more a part
>> of the board model than of the cpu itself.
>
> Feel free to submit patches changing that. Prerequisite should
> then be to have those devices be pure TYPE_DEVICE rather than
> TYPE_SYS_BUS_DEVICE, or otherwise you'll run into the same
> hot-plug trap as we did with the x86 APIC (we had to invent a
> hotpluggable ICC bus as interim solution).

Mmm. I'm not sure what cpu hotplug should be in the ARM world
since obviously you can't hotplug a SoC (one possibility is
that we don't actually hotplug CPUs, we just create N of them
but leave most of them "powered off" via a power-control API
like PSCI).

>> Having '-cpu host'
>> not affect them might be the pragmatic choice, since it fits
>> with what QEMU currently does and with kernel-side situations
>> where the host CPU may only be able to show the guest VM a
>> GICv2 view of the world (or only a GICv3, as the case may be).
>> For this to work it does require that guests figure out what
>> their per-cpu peripherals are by looking at the device tree
>> rather than saying "oh, this is an A57, I know all A57s
>> have this", of course...
>
> Without directly answering the question and continuing from above, my
> personal view has been that we need to get away from the current CPU
> model to a) how hardware is structured and b) how we want to have things
> behave in virtualized environments.
>
> Take x86 as an example: CPUState corresponds to a hyperthread today, but
> we want hotplug to work like it does on a physical machine: hot-adding
> on socket-level only. Beyond just building the topology with Container
> objects, that means having a Xeon-X5-4242 object that has-a CPU core
> has-a CPU thread and any devices the particular layers bring along.
>
> For SoCs I have been proposing - for sh7750 and lately tegra2 - to model
> "the black chip on the board" as a TYPE_DEVICE for encapsulation across
> boards. Meaning the GIC would no longer be instantiated on the board but
> as part of an object, and -smp and -cpu would as a consequence lose
> influence.

Yes, I agree with this as a general approach.

> We could interpret -cpu host as instantiating the host's SoC object. But
> the mainstream SoC for KVM virtualization is exynos5, and no one sat
> down to model exynos5 in QEMU so far, so that would be moot. Versatile
> Express is rather unlikely to match the host environment KVM is used in,
> and when using Soft Macros (or what ARM calls their FPGA-based
> emulation) then things get fuzzy anyway.

Agreed that '-cpu host' shouldn't instantiate a whole SoC. I think
the most useful behaviour would be that (for example) an A15 SoC
model should permit only either "-cpu cortex-a15" (pointless but
preserves backwards compatibility for command lines) or "-cpu host"
(only allowed when KVM enabled, possibly including a check that the
host CPU is 'close enough' to the SoC CPU, if we can define what
we mean by 'close enough'...).
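
A rough sketch of that policy, again in Python and not taken from QEMU. The function name, its parameters and the default 'close enough' test (exact equality) are all assumptions made for illustration; what 'close enough' should really mean is left open above.

```python
def check_cpu_option(board_cpu, requested, kvm_enabled, host_cpu=None,
                     close_enough=lambda board, host: board == host):
    """Validate a -cpu argument for an SoC model with a fixed CPU type.

    Returns the accepted value, or raises ValueError with a reason.
    All names here are hypothetical; this is not QEMU's API.
    """
    if requested == board_cpu:
        # Redundant, but keeps existing command lines working.
        return requested
    if requested == "host":
        if not kvm_enabled:
            raise ValueError("-cpu host requires KVM")
        if not close_enough(board_cpu, host_cpu):
            raise ValueError(
                "host CPU %r is not close enough to %r" % (host_cpu, board_cpu))
        return requested
    raise ValueError(
        "this SoC only accepts -cpu %s or -cpu host" % board_cpu)

if __name__ == "__main__":
    print(check_cpu_option("cortex-a15", "cortex-a15", kvm_enabled=False))
    print(check_cpu_option("cortex-a15", "host", kvm_enabled=True,
                           host_cpu="cortex-a15"))
```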

> Similar problem for CPU hotplug: there is no real match in physical ARM
> hardware that we can copy for KVM/QEMU. It's all mixed in one chip where
> we can only enable/disable things via MMIO in physical reality.

This is why I like the idea of addressing the "give this VM more/fewer
CPUs" requirement via implementing power control and PSCI: it
actually does match what the hardware does to give more or fewer
cores to an OS. However I don't know what ARM server hardware
is likely to do in the way of hotplug...
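
To make the "create N CPUs, leave most of them powered off" idea concrete, here is a toy model in Python. It is only a sketch: the class, its methods and the status strings are hypothetical stand-ins named after PSCI's CPU_ON/CPU_OFF calls, not an implementation of the PSCI interface.

```python
class VirtualMachine:
    """Toy model: every vcpu exists from the start; only power state changes."""

    def __init__(self, max_cpus, boot_cpus=1):
        if not 1 <= boot_cpus <= max_cpus:
            raise ValueError("need 1 <= boot_cpus <= max_cpus")
        # All max_cpus vcpus are created up front; the first boot_cpus are on.
        self.powered_on = [i < boot_cpus for i in range(max_cpus)]

    def cpu_on(self, index):
        """Power on a vcpu, roughly what a guest CPU_ON request would ask for."""
        if not 0 <= index < len(self.powered_on):
            return "INVALID_PARAMETERS"
        if self.powered_on[index]:
            return "ALREADY_ON"
        self.powered_on[index] = True
        return "SUCCESS"

    def cpu_off(self, index):
        """Power off a vcpu; it stays created, just not running."""
        if not 0 <= index < len(self.powered_on):
            return "INVALID_PARAMETERS"
        self.powered_on[index] = False
        return "SUCCESS"

    def online_count(self):
        return sum(self.powered_on)

if __name__ == "__main__":
    vm = VirtualMachine(max_cpus=4)
    vm.cpu_on(1)
    print(vm.online_count(), "of", len(vm.powered_on), "vcpus powered on")
```

The point of the model is that "giving the VM more CPUs" never creates or destroys a CPU object, so no hotplug machinery is involved.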

> You recently proposed to have the CPUs in the a*mpcore_priv object,
> which also happens to own the GIC. Having the CPU model be a property of
> a*mpcore would complicate a lot of things QOM-wise but for the question
> at hand would allow exchanging the GIC based on CPU model, so I'm
> undecided. a9mpcore_priv with a cortex-a15 doesn't make much sense
> though, given there's a15mpcore_priv with a different number of IRQs and
> fewer/different child devices.

The patches I sent out today that get rid of arm_pic ought to
make it a little easier to move CPUs into the a*mpcore containers
if we want to go down that path.

> Given that ARM SoCs are much less standardized than x86 PCs, I would
> conclude that passing random CPUs into a board/SoC does not make sense
> and should at least be limited to known-good combinations such as
> Cortex-A7 vs. Cortex-A15.

I totally agree with this. The only reason we don't error out more
than we do already is a combination of inertia and not having a
nice infrastructure for boards to put limits on what the user can
pass on the command line [compare also "how does a board say it
needs a kernel or a flash image" and "how does a board say that
it can only handle up to 512MB of RAM"...]
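
Purely as an illustration of what such infrastructure might look like, here is a declarative sketch in Python. Nothing like this exists in QEMU as described here; the class, field names and the example board are invented for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class BoardLimits:
    """Hypothetical per-board declaration of command-line constraints."""
    name: str
    allowed_cpus: tuple
    max_ram_mb: int
    needs_kernel: bool = False

def validate(limits, cpu, ram_mb, kernel=None):
    """Return a list of error strings; an empty list means acceptable."""
    errors = []
    if cpu not in limits.allowed_cpus:
        errors.append("%s: unsupported CPU %s" % (limits.name, cpu))
    if ram_mb > limits.max_ram_mb:
        errors.append("%s: at most %dMB of RAM supported"
                      % (limits.name, limits.max_ram_mb))
    if limits.needs_kernel and kernel is None:
        errors.append("%s: a kernel image is required" % limits.name)
    return errors

# Invented example board, not a real QEMU machine definition.
EXAMPLE_BOARD = BoardLimits(
    name="example-board",
    allowed_cpus=("cortex-a15", "host"),
    max_ram_mb=512,
    needs_kernel=True,
)

if __name__ == "__main__":
    for problem in validate(EXAMPLE_BOARD, "cortex-a9", 1024):
        print(problem)
```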

-- PMM


