qemu-ppc
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-ppc] POWER8 <-> POWER8E migration changed/broken in ppc-for.2.


From: Alexey Kardashevskiy
Subject: Re: [Qemu-ppc] POWER8 <-> POWER8E migration changed/broken in ppc-for.2.10. Intended?
Date: Thu, 22 Jun 2017 12:49:47 +1000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.1.0

On 22/06/17 11:45, Alexey Kardashevskiy wrote:
> On 22/06/17 07:19, Daniel Henrique Barboza wrote:
>> Hi,
>>
>> Yesterday I noticed that the migration between POWER8 and POWER8E hosts
>> isn't working
>> as before. These are the machines that I've been using for DRC testing and
>> I discovered
>> this behavior when trying to test the DRC cleanup series (v5).
>>
>> What happens in that the migration seems to be completed - no errors on
>> both source and
>> destination, destination status marked as 'running' - but the guest at dest
>> looks
>> frozen/inactive. Migrating from POWER8 to POWER8E or the other way around
>> presents the same behavior.
>>
>> This is the command line I'm executing:
>>
>> sudo ./qemu-system-ppc64 -name migrate_qemu -boot strict=on -device
>> nec-usb-xhci,id=usb,bus=pci.0,addr=0xf -device
>> spapr-vscsi,id=scsi0,reg=0x2000 -smp
>> 1,maxcpus=4,sockets=4,cores=1,threads=1 --machine
>> pseries,accel=kvm,kvm-type=HV,usb=off,dump-guest-core=off -m
>> 4G,slots=32,maxmem=32G -drive
>> file=/home/danielhb/vm_imgs/ubuntu1704.qcow2,format=qcow2,if=none,id=drive-virtio-disk0,cache=none
>> -device
>> virtio-blk-pci,scsi=off,bus=pci.0,addr=0x2,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
>> -nographic
>>
>> I've tracked the problem down to the commit b0517f54ef7d, "ppc: Rework CPU
>> compatibility testing across migration". If I revert this commit, migration
>> works as before
>> between these hosts.
> 
> With b0517f54ef7d, CPU state is not marked dirty and SPRs (all registers in
> fact) are not put to KVM at the end of migration, hence the freeze, this
> has something to do with cpu_synchronize_state() called from ppc_set_compat().

No, I was not right, please ignore.


> 
> 
> 
>>
>> After some debugging I've found that the code is going through the code
>> block that
>> is supposed to be executed only in compat mode:
>>
>> #if defined(TARGET_PPC64)
>>     if (cpu->compat_pvr) {
>>         Error *local_err = NULL;
>>
>>         ppc_set_compat(cpu, cpu->compat_pvr, &local_err);
>>         if (local_err) {
>>             error_report_err(local_err);
>>             error_free(local_err);
>>             return -1;
>>         }
>>     } else
>> #endif
>>
>> I've talked with Mike about it and it doesn't make much sense going through
>> this IF because cpu->compat_pvr is supposed NULL when running in
>> non-compat mode. Mike found out that cpu->compat_pvr is being
>> set by the guest in CAS in h_client_architecture_support():
>>
>> /* Update CPUs */
>>     if (cpu->compat_pvr != cas_pvr) {
>>         ppc_set_compat_all(cas_pvr, &local_err);
>>         if (local_err) {
>>             error_report_err(local_err);
>>             return H_HARDWARE;
>>         }
>>     }
>>
>> So there's that. However, even if I hack it to not go there and go to the
>> "if (!pvr_match(cpu, env->spr[SPR_PVR]))" block,  this migration behavior
>> change still happens. It is worth noticing that pvr_match() returns TRUE
>> in my scenario.
>>
>> The way I've found to make migration work as before is to set
>> env->spr[SPR_PVR]
>> like it was before the patch:
>>
>>         env->spr[SPR_PVR] = env->spr_cb[SPR_PVR].default_value;
>>
>> In this line, env->spr_cb[SPR_PVR].default_value is the PVR value of the
>> destination CPU. This isn't being set after the patch or anywhere in the
>> newly added code.  Adding this statement after the added IF block makes
>> migration work as before, even if I don't hack the code to avoid the compat
>> block.
>>
>> I am sending this email instead of a patch because (1) I am not sure if
>> this is
>> an intended/expected behavior and there are more patches to be
>> applied on top of this one and (2) I bet that the env->spr[SPR_PVR] = ....
>> line was removed for a reason and adding it back isn't a proper fix.
>>
>>
>> Thanks,
>>
>>
>> Daniel
>>
>>
> 
> 


-- 
Alexey



reply via email to

[Prev in Thread] Current Thread [Next in Thread]