qemu-ppc
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Qemu-ppc] POWER8 <-> POWER8E migration changed/broken in ppc-for.2.10.


From: Daniel Henrique Barboza
Subject: [Qemu-ppc] POWER8 <-> POWER8E migration changed/broken in ppc-for.2.10. Intended?
Date: Wed, 21 Jun 2017 18:19:15 -0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.1.0

Hi,

Yesterday I noticed that the migration between POWER8 and POWER8E hosts isn't working as before. These are the machines that I've been using for DRC testing and I discovered
this behavior when trying to test the DRC cleanup series (v5).

What happens in that the migration seems to be completed - no errors on both source and destination, destination status marked as 'running' - but the guest at dest looks
frozen/inactive. Migrating from POWER8 to POWER8E or the other way around
presents the same behavior.

This is the command line I'm executing:

sudo ./qemu-system-ppc64 -name migrate_qemu -boot strict=on -device nec-usb-xhci,id=usb,bus=pci.0,addr=0xf -device spapr-vscsi,id=scsi0,reg=0x2000 -smp 1,maxcpus=4,sockets=4,cores=1,threads=1 --machine pseries,accel=kvm,kvm-type=HV,usb=off,dump-guest-core=off -m 4G,slots=32,maxmem=32G -drive file=/home/danielhb/vm_imgs/ubuntu1704.qcow2,format=qcow2,if=none,id=drive-virtio-disk0,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x2,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -nographic

I've tracked the problem down to the commit b0517f54ef7d, "ppc: Rework CPU
compatibility testing across migration". If I revert this commit, migration works as before
between these hosts.

After some debugging I've found that the code is going through the code block that
is supposed to be executed only in compat mode:

#if defined(TARGET_PPC64)
    if (cpu->compat_pvr) {
        Error *local_err = NULL;

        ppc_set_compat(cpu, cpu->compat_pvr, &local_err);
        if (local_err) {
            error_report_err(local_err);
            error_free(local_err);
            return -1;
        }
    } else
#endif

I've talked with Mike about it and it doesn't make much sense going through
this IF because cpu->compat_pvr is supposed NULL when running in
non-compat mode. Mike found out that cpu->compat_pvr is being
set by the guest in CAS in h_client_architecture_support():

/* Update CPUs */
    if (cpu->compat_pvr != cas_pvr) {
        ppc_set_compat_all(cas_pvr, &local_err);
        if (local_err) {
            error_report_err(local_err);
            return H_HARDWARE;
        }
    }

So there's that. However, even if I hack it to not go there and go to the
"if (!pvr_match(cpu, env->spr[SPR_PVR]))" block,  this migration behavior
change still happens. It is worth noticing that pvr_match() returns TRUE
in my scenario.

The way I've found to make migration work as before is to set env->spr[SPR_PVR]
like it was before the patch:

        env->spr[SPR_PVR] = env->spr_cb[SPR_PVR].default_value;

In this line, env->spr_cb[SPR_PVR].default_value is the PVR value of the
destination CPU. This isn't being set after the patch or anywhere in the
newly added code.  Adding this statement after the added IF block makes
migration work as before, even if I don't hack the code to avoid the compat block.

I am sending this email instead of a patch because (1) I am not sure if this is
an intended/expected behavior and there are more patches to be
applied on top of this one and (2) I bet that the env->spr[SPR_PVR] = ....
line was removed for a reason and adding it back isn't a proper fix.


Thanks,


Daniel




reply via email to

[Prev in Thread] Current Thread [Next in Thread]