qemu-ppc
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-ppc] POWER8 <-> POWER8E migration changed/broken in ppc-for.2.


From: Alexey Kardashevskiy
Subject: Re: [Qemu-ppc] POWER8 <-> POWER8E migration changed/broken in ppc-for.2.10. Intended?
Date: Thu, 22 Jun 2017 13:22:20 +1000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.1.0

On 22/06/17 12:49, Alexey Kardashevskiy wrote:
> On 22/06/17 11:45, Alexey Kardashevskiy wrote:
>> On 22/06/17 07:19, Daniel Henrique Barboza wrote:
>>> Hi,
>>>
>>> Yesterday I noticed that the migration between POWER8 and POWER8E hosts
>>> isn't working
>>> as before. These are the machines that I've been using for DRC testing and
>>> I discovered
>>> this behavior when trying to test the DRC cleanup series (v5).
>>>
>>> What happens in that the migration seems to be completed - no errors on
>>> both source and
>>> destination, destination status marked as 'running' - but the guest at dest
>>> looks
>>> frozen/inactive. Migrating from POWER8 to POWER8E or the other way around
>>> presents the same behavior.
>>>
>>> This is the command line I'm executing:
>>>
>>> sudo ./qemu-system-ppc64 -name migrate_qemu -boot strict=on -device
>>> nec-usb-xhci,id=usb,bus=pci.0,addr=0xf -device
>>> spapr-vscsi,id=scsi0,reg=0x2000 -smp
>>> 1,maxcpus=4,sockets=4,cores=1,threads=1 --machine
>>> pseries,accel=kvm,kvm-type=HV,usb=off,dump-guest-core=off -m
>>> 4G,slots=32,maxmem=32G -drive
>>> file=/home/danielhb/vm_imgs/ubuntu1704.qcow2,format=qcow2,if=none,id=drive-virtio-disk0,cache=none
>>> -device
>>> virtio-blk-pci,scsi=off,bus=pci.0,addr=0x2,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
>>> -nographic
>>>
>>> I've tracked the problem down to the commit b0517f54ef7d, "ppc: Rework CPU
>>> compatibility testing across migration". If I revert this commit, migration
>>> works as before
>>> between these hosts.
>>
>> With b0517f54ef7d, CPU state is not marked dirty and SPRs (all registers in
>> fact) are not put to KVM at the end of migration, hence the freeze, this
>> has something to do with cpu_synchronize_state() called from 
>> ppc_set_compat().
> 
> No, I was not right, please ignore.


This hack fixes it:


diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
index f2f7c531bc..7ec4146b2b 100644
--- a/target/ppc/kvm.c
+++ b/target/ppc/kvm.c
@@ -919,6 +919,7 @@ static int kvm_put_vpa(CPUState *cs)
     return 0;
 }
 #endif /* TARGET_PPC64 */
+static uint32_t mfpvr(void);

 int kvmppc_put_books_sregs(PowerPCCPU *cpu)
 {
@@ -927,7 +928,7 @@ int kvmppc_put_books_sregs(PowerPCCPU *cpu)
     int i;

     sregs.pvr = env->spr[SPR_PVR];
-
+    sregs.pvr = mfpvr();
     sregs.u.s.sdr1 = env->spr[SPR_SDR1];

     /* Sync SLB */



It works because otherwise the KVM HV's KVM_SET_SREGS handler checks if PVR
is exactly the same as the running host and returns -EINVAL in
kvmppc_put_books_sregs() which prevents kvm_arch_put_registers() from
completing the job.



> 
> 
>>
>>
>>
>>>
>>> After some debugging I've found that the code is going through the code
>>> block that
>>> is supposed to be executed only in compat mode:
>>>
>>> #if defined(TARGET_PPC64)
>>>     if (cpu->compat_pvr) {
>>>         Error *local_err = NULL;
>>>
>>>         ppc_set_compat(cpu, cpu->compat_pvr, &local_err);
>>>         if (local_err) {
>>>             error_report_err(local_err);
>>>             error_free(local_err);
>>>             return -1;
>>>         }
>>>     } else
>>> #endif
>>>
>>> I've talked with Mike about it and it doesn't make much sense going through
>>> this IF because cpu->compat_pvr is supposed NULL when running in
>>> non-compat mode. Mike found out that cpu->compat_pvr is being
>>> set by the guest in CAS in h_client_architecture_support():
>>>
>>> /* Update CPUs */
>>>     if (cpu->compat_pvr != cas_pvr) {
>>>         ppc_set_compat_all(cas_pvr, &local_err);
>>>         if (local_err) {
>>>             error_report_err(local_err);
>>>             return H_HARDWARE;
>>>         }
>>>     }
>>>
>>> So there's that. However, even if I hack it to not go there and go to the
>>> "if (!pvr_match(cpu, env->spr[SPR_PVR]))" block,  this migration behavior
>>> change still happens. It is worth noticing that pvr_match() returns TRUE
>>> in my scenario.
>>>
>>> The way I've found to make migration work as before is to set
>>> env->spr[SPR_PVR]
>>> like it was before the patch:
>>>
>>>         env->spr[SPR_PVR] = env->spr_cb[SPR_PVR].default_value;
>>>
>>> In this line, env->spr_cb[SPR_PVR].default_value is the PVR value of the
>>> destination CPU. This isn't being set after the patch or anywhere in the
>>> newly added code.  Adding this statement after the added IF block makes
>>> migration work as before, even if I don't hack the code to avoid the compat
>>> block.
>>>
>>> I am sending this email instead of a patch because (1) I am not sure if
>>> this is
>>> an intended/expected behavior and there are more patches to be
>>> applied on top of this one and (2) I bet that the env->spr[SPR_PVR] = ....
>>> line was removed for a reason and adding it back isn't a proper fix.
>>>
>>>
>>> Thanks,
>>>
>>>
>>> Daniel
>>>
>>>
>>
>>
> 
> 


-- 
Alexey



reply via email to

[Prev in Thread] Current Thread [Next in Thread]