qemu-ppc
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-ppc] POWER8 <-> POWER8E migration changed/broken in ppc-for.2.


From: David Gibson
Subject: Re: [Qemu-ppc] POWER8 <-> POWER8E migration changed/broken in ppc-for.2.10. Intended?
Date: Fri, 7 Jul 2017 12:09:29 +1000
User-agent: Mutt/1.8.0 (2017-02-23)

On Tue, Jul 04, 2017 at 08:20:12AM -0300, Daniel Henrique Barboza wrote:
> 
> 
> On 06/22/2017 09:46 PM, Alexey Kardashevskiy wrote:
> > On 22/06/17 18:19, David Gibson wrote:
> > > On Thu, Jun 22, 2017 at 09:08:55AM +0200, Thomas Huth wrote:
> > > > On 22.06.2017 05:22, Alexey Kardashevskiy wrote:
> > > > > On 22/06/17 12:49, Alexey Kardashevskiy wrote:
> > > > > > On 22/06/17 11:45, Alexey Kardashevskiy wrote:
> > > > > > > On 22/06/17 07:19, Daniel Henrique Barboza wrote:
> > > > > > > > Hi,
> > > > > > > > 
> > > > > > > > Yesterday I noticed that the migration between POWER8 and 
> > > > > > > > POWER8E hosts
> > > > > > > > isn't working
> > > > > > > > as before. These are the machines that I've been using for DRC 
> > > > > > > > testing and
> > > > > > > > I discovered
> > > > > > > > this behavior when trying to test the DRC cleanup series (v5).
> > > > > > > > 
> > > > > > > > What happens in that the migration seems to be completed - no 
> > > > > > > > errors on
> > > > > > > > both source and
> > > > > > > > destination, destination status marked as 'running' - but the 
> > > > > > > > guest at dest
> > > > > > > > looks
> > > > > > > > frozen/inactive. Migrating from POWER8 to POWER8E or the other 
> > > > > > > > way around
> > > > > > > > presents the same behavior.
> > > > > > > > 
> > > > > > > > This is the command line I'm executing:
> > > > > > > > 
> > > > > > > > sudo ./qemu-system-ppc64 -name migrate_qemu -boot strict=on 
> > > > > > > > -device
> > > > > > > > nec-usb-xhci,id=usb,bus=pci.0,addr=0xf -device
> > > > > > > > spapr-vscsi,id=scsi0,reg=0x2000 -smp
> > > > > > > > 1,maxcpus=4,sockets=4,cores=1,threads=1 --machine
> > > > > > > > pseries,accel=kvm,kvm-type=HV,usb=off,dump-guest-core=off -m
> > > > > > > > 4G,slots=32,maxmem=32G -drive
> > > > > > > > file=/home/danielhb/vm_imgs/ubuntu1704.qcow2,format=qcow2,if=none,id=drive-virtio-disk0,cache=none
> > > > > > > > -device
> > > > > > > > virtio-blk-pci,scsi=off,bus=pci.0,addr=0x2,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
> > > > > > > > -nographic
> > > > > > > > 
> > > > > > > > I've tracked the problem down to the commit b0517f54ef7d, "ppc: 
> > > > > > > > Rework CPU
> > > > > > > > compatibility testing across migration". If I revert this 
> > > > > > > > commit, migration
> > > > > > > > works as before
> > > > > > > > between these hosts.
> > > > > > > With b0517f54ef7d, CPU state is not marked dirty and SPRs (all 
> > > > > > > registers in
> > > > > > > fact) are not put to KVM at the end of migration, hence the 
> > > > > > > freeze, this
> > > > > > > has something to do with cpu_synchronize_state() called from 
> > > > > > > ppc_set_compat().
> > > > > > No, I was not right, please ignore.
> > > > > 
> > > > > This hack fixes it:
> > > > > 
> > > > > 
> > > > > diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
> > > > > index f2f7c531bc..7ec4146b2b 100644
> > > > > --- a/target/ppc/kvm.c
> > > > > +++ b/target/ppc/kvm.c
> > > > > @@ -919,6 +919,7 @@ static int kvm_put_vpa(CPUState *cs)
> > > > >       return 0;
> > > > >   }
> > > > >   #endif /* TARGET_PPC64 */
> > > > > +static uint32_t mfpvr(void);
> > > > > 
> > > > >   int kvmppc_put_books_sregs(PowerPCCPU *cpu)
> > > > >   {
> > > > > @@ -927,7 +928,7 @@ int kvmppc_put_books_sregs(PowerPCCPU *cpu)
> > > > >       int i;
> > > > > 
> > > > >       sregs.pvr = env->spr[SPR_PVR];
> > > > > -
> > > > > +    sregs.pvr = mfpvr();
> > > > >       sregs.u.s.sdr1 = env->spr[SPR_SDR1];
> > > > > 
> > > > >       /* Sync SLB */
> > > > In case you consider to submit this patch: Please add a check for
> > > > !kvmppc_is_pr() so that this is only done for KVM-HV.
> > > > 
> > > > But I think the right way to hack this is somewhere in machine.c
> > > > instead, eg. in cpu_post_load().
> > > Is the problem is just that we're not considering POWER8 to be
> > > compatible enough with POWER8E, then we should be altering the
> > > pvr_match function so it does.
> > 
> > The problem is that the KVM HV expects the exact host's PVR in
> > KVM_SET_SREGS, other than that POWER8 == POWER8E pvr_match-wise afaict.
> > 
> > And Paul suggested that this will bite us again when we start really
> > worrying about migrating POWER8 to POWER9 so we should probably drop that
> > PVR check in the KVM at all, in meanwhile we should set PVR to the
> > destination host PVR value sometime after all these pvr_match() checks, the
> > place above was just my first guess (which is usually incorrect, I know :) )
> > 
> 
> I have no problems with this solution - we can re-insert the line
> 
>         env->spr[SPR_PVR] = env->spr_cb[SPR_PVR].default_value;
> 
> 
> after all the pvr_checks were made.
> 
> Is this an acceptable workaround or do we need to dig further? Alexey, do
> you want to send this patch?

I'm very uneasy with this approach.  For one thing it relies on
knowing if we're using HV or PR KVM, which we should be avoiding.

But more importantly, it denies the kernel an opportunity to assess if
the CPU it's hosting on really is a good enough approximation of the
one that qemu really wants.

My intention here - right since the early days of KVM HV, was that
qemu would always put the PVR it really wanted into the sregs field.
KVM PR would then emulate a specific CPU based on that.  HV, not
having that option, would instead assess if the requested CPU was
close enough to the host CPU to get away with it.

But I guess, since qemu's haphazard handling of compat modes meant
that didn't really happen until now, we never properly tested that.

So, really I think the correct fix is in the kernel - to make it
accept a POWER8E PVR when hosted on a POWER8 and vice versa.  But,
given the kernels that are out there, I guess we'll need a workaround
in qemu as well.  Ugh.

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson

Attachment: signature.asc
Description: PGP signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]