
Re: [RFC PATCH 0/4] ppc: nested TCG migration (KVM-on-TCG)

From: David Gibson
Subject: Re: [RFC PATCH 0/4] ppc: nested TCG migration (KVM-on-TCG)
Date: Fri, 25 Feb 2022 14:54:47 +1100

On Thu, Feb 24, 2022 at 09:00:24PM +0000, Mark Cave-Ayland wrote:
> On 24/02/2022 18:58, Fabiano Rosas wrote:
> > This series implements the migration for a TCG pseries guest running a
> > nested KVM guest. This is just like migrating a pseries TCG guest, but
> > with some extra state to allow a nested guest to continue to run on
> > the destination.
> > 
> > Unfortunately the regular TCG migration scenario (not nested) is not
> > fully working so I cannot be entirely sure the nested migration is
> > correct. I have included a couple of patches for the general migration
> > case that (I think?) improve the situation a bit, but I'm still seeing
> > hard lockups and other issues with more than 1 vcpu.
> > 
> > This is more of an early RFC to see if anyone spots something right
> > away. I haven't made much progress in debugging the general TCG
> > migration case so if anyone has any input there as well I'd appreciate
> > it.
> > 
> > Thanks
> > 
> > Fabiano Rosas (4):
> >    target/ppc: TCG: Migrate tb_offset and decr
> >    spapr: TCG: Migrate spapr_cpu->prod
> >    hw/ppc: Take nested guest into account when saving timebase
> >    spapr: Add KVM-on-TCG migration support
> > 
> >   hw/ppc/ppc.c                    | 17 +++++++-
> >   hw/ppc/spapr.c                  | 19 ++++++++
> >   hw/ppc/spapr_cpu_core.c         | 77 +++++++++++++++++++++++++++++++++
> >   include/hw/ppc/spapr_cpu_core.h |  2 +-
> >   target/ppc/machine.c            | 61 ++++++++++++++++++++++++++
> >   5 files changed, 174 insertions(+), 2 deletions(-)
> FWIW I noticed there were some issues with migrating the decrementer on Mac
> machines a while ago which caused a hang on the destination with TCG (for
> MacOS on an x86 host in my case). Have a look at the following threads for
> reference:
> https://lists.gnu.org/archive/html/qemu-devel/2016-01/msg00546.html
> https://lists.gnu.org/archive/html/qemu-devel/2016-01/msg04622.html
> IIRC there is code that assumes any migration on PPC is being done live, and
> so adjusts the timebase on the destination to reflect wall clock time by
> recalculating tb_offset. I haven't looked at the code for a while, but I
> think the outcome was that migration needs two phases: the first migrates
> the timebase as-is for guests that are paused during migration, whilst the
> second notifies hypervisor-aware guest OSes such as Linux to make the
> timebase adjustment, if required, when the guest is running.

Whether the timebase is adjusted for the migration downtime depends on
whether the guest clock is pinned to wall clock time or not.  Usually
it should be (because you don't want your clocks to go wrong on
migration of a production system).  However, in neither case should
the guest be involved.

There may be guest side code related to this in Linux, but that's
probably for migration under pHyp, which is a guest aware migration
system.  That's essentially unrelated to migration under qemu/kvm,
which is a guest unaware system.

Guest-aware migration has some nice-sounding advantages; in particular
it can allow migrations across a heterogeneous cluster with differences
between hosts that the hypervisor can't hide, or can't efficiently
hide.  However, it is, IMO, a deeply broken approach, because it can
allow an uncooperative guest to indefinitely block migration, and for
it to be reliably correct it requires *much* more pinning down of
exactly which host-system changes the guest can and can't be expected
to cope with than PAPR has ever bothered to do.

David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!

Attachment: signature.asc
Description: PGP signature
