
Re: [RFC PATCH 0/4] ppc: nested TCG migration (KVM-on-TCG)

From: Mark Cave-Ayland
Subject: Re: [RFC PATCH 0/4] ppc: nested TCG migration (KVM-on-TCG)
Date: Thu, 24 Feb 2022 21:00:24 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.6.0

On 24/02/2022 18:58, Fabiano Rosas wrote:

> This series implements the migration for a TCG pseries guest running a
> nested KVM guest. This is just like migrating a pseries TCG guest, but
> with some extra state to allow a nested guest to continue to run on
> the destination.
> 
> Unfortunately the regular TCG migration scenario (not nested) is not
> fully working, so I cannot be entirely sure the nested migration is
> correct. I have included a couple of patches for the general migration
> case that (I think?) improve the situation a bit, but I'm still seeing
> hard lockups and other issues with more than 1 vcpu.
> 
> This is more of an early RFC to see if anyone spots something right
> away. I haven't made much progress in debugging the general TCG
> migration case, so if anyone has any input there as well I'd
> appreciate it.

> Fabiano Rosas (4):
>    target/ppc: TCG: Migrate tb_offset and decr
>    spapr: TCG: Migrate spapr_cpu->prod
>    hw/ppc: Take nested guest into account when saving timebase
>    spapr: Add KVM-on-TCG migration support
> 
>   hw/ppc/ppc.c                    | 17 +++++++-
>   hw/ppc/spapr.c                  | 19 ++++++++
>   hw/ppc/spapr_cpu_core.c         | 77 +++++++++++++++++++++++++++++++++
>   include/hw/ppc/spapr_cpu_core.h |  2 +-
>   target/ppc/machine.c            | 61 ++++++++++++++++++++++++++
>   5 files changed, 174 insertions(+), 2 deletions(-)

FWIW, I noticed a while ago that there were some issues with migrating the decrementer on Mac machines which cause a hang on the destination with TCG (for MacOS on an x86 host in my case). Have a look at the following threads for reference:


IIRC there is code that assumes any migration on PPC is being done live, and so adjusts the timebase on the destination to reflect wall clock time by recalculating tb_offset. I haven't looked at the code for a while, but I think the outcome was that migration needs two phases: the first migrates the timebase as-is for guests that are paused during migration, whilst the second notifies hypervisor-aware guest OSs such as Linux to make the timebase adjustment, if required, while the guest is running.


