[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [Query] Live Migration between machines with different

From: Juan Quintela
Subject: Re: [Qemu-devel] [Query] Live Migration between machines with different processor ids
Date: Tue, 04 Sep 2018 12:27:29 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/26.1 (gnu/linux)

Andrew Jones <address@hidden> wrote:
> On Tue, Sep 04, 2018 at 09:16:58AM +0000, Jaggi, Manish wrote:
>> So which approach should be taken here, whats your take...

[ Remoning Anthony form CC.  Address don't exist anymore ]

> Inventing a base-AArch64 cpu model that can then be extended with optional
> features is a nice way to extend the migratability of a guest, however
> it's hard to do because of errata. Since errata workarounds are enabled
> per MIDR, then we'd need to invent our own MIDR and also some way to
> communicate which errata we want to enable, possibly through some paravirt
> mechanism or through some implementation defined system registers that
> KVM would need to reserve and define.
> That's not just a ton of work for the entire virt stack (not just KVM and
> QEMU, but also all the layers above), but it's possible that it won't be
> useful in the end anyway. There's risk that enabling just one erratum
> workaround would restrict the guest to hosts of the exact same type
> anyway. For each erratum that needs to be enabled, the probability of
> enabling an incompatible one goes up, so it may not be likely to do much
> better than '-cpu host' in the end. I'm afraid that until errata are
> primarily showing up in optional CPU features that can simply be disabled
> for the workaround, that we're stuck with '-cpu host'. I'd be happy to
> discuss it more though.

Then, we are basically at the point when we can only migrate to the
exact same processor, no?

> In short, I'd go with the proposal above, for now, with possibly one
> change. libvirt folk (Andrea Bolognani and Pino Toscano) suggest that
> the guest invariant register updating on the destination host only be
> done if the user opts-in to it. This is because right now if a user
> tries to migrate to a host that is not 100% identical the migration
> will fail, which makes the "mistake" clear. If we silently change the
> behavior to allow it, then what could have been a mistake, because
> the hosts aren't actually "close enough", may go unnoticed. I'm not
> 100% sure we need another user opt-in flag to be set, though, as I
> think the '-cpu host' indicates the user expects the VCPU to look
> like the host CPU, and even after migration that expectation should be
> met. Simply, users that migrate '-cpu host' VMs need to know what they're
> doing.

I don't know really what to say here:
- on the one hand, not creating the proper cpu types is going to bite
  us, big time, later.
- on the other hand, it appears that cpu compatibility is not so
  "strong", "nice", or whatever do you want to call it on ARM land.

Why I am so worried?  Because we have spent lots (and I mean lots) of
time on x86_64 when we forgot to enable/disable/indicate that one cpu
has a new MSR/feature.  Normal problem is that it only happens when
customer is using that particular feature.  It has been the case for us
that to reporduce the problem we had to ping-pong migration several
hundred times between two different cpus until we found _why_ it failed.

So, I am pretty sure that:
a- doing it right is a lot of work now.
b- doing it fast now is a lot of work now, and much more work later.

Later, Juan.

reply via email to

[Prev in Thread] Current Thread [Next in Thread]