Re: ARM Snapshots Not Backwards-Compatible

From: Philippe Mathieu-Daudé
Subject: Re: ARM Snapshots Not Backwards-Compatible
Date: Wed, 3 Feb 2021 09:21:18 +0100

Cc'ing migration team and qemu-arm@ list.

On 2/3/21 5:01 AM, Aaron Lindsay wrote:
> Hello,
> I'm attempting to restore an AArch64 snapshot taken on QEMU 4.1.0 on
> QEMU 5.2.0, using system mode. My previous impression, possibly from
> https://wiki.qemu.org/Features/Migration/Troubleshooting#Basics, was that
> this ought to work:
>> Note that QEMU supports migrating forward between QEMU versions
> Note that I'm using qemu-system-aarch64 with -loadvm.
> However, I've run into several issues I thought I should report. The
> first was that the commit linked below changed the address of CBAR, which
> resulted in a mismatch of the register IDs in `cpu_post_load` in
> target/arm/machine.c:
> https://patchwork.kernel.org/project/qemu-devel/patch/20190927144249.29999-2-peter.maydell@linaro.org/
> The second was that several system registers have changed which bits are
> allowed to be written in different circumstances, seemingly as a result
> of a combination of bugfixes and implementation of additional behavior.
> These hit errors detected in `write_list_to_cpustate` in
> target/arm/helper.c.
> The third is that the meanings of the bits in env->features (as defined by
> `enum arm_features` in target/arm/cpu.h) have shifted. For example, the
> ARM_FEATURE_VFP* features (including ARM_FEATURE_VFP3 and
> ARM_FEATURE_VFP4) have all been removed and ARM_FEATURE_V8_1M has been
> added since 4.1.0. Heck, even I have added a field there in the past.
> Unfortunately, these additions/removals mean that when env->features is
> saved on one version and restored on another, the bits can mean different
> things. Notably, the removal of the *VFP features means that a snapshot
> of a CPU reporting ARM_FEATURE_VFP3 support on 4.1.0 is interpreted as
> reporting ARM_FEATURE_M on 5.2.0!
> My guess, given the numerous issues and the additional complexity
> required to properly implement backwards-compatible snapshotting, is
> that this is not a primary goal. However, if it is a goal, what steps
> can/should we take to support it more thoroughly?
> Thanks!
> -Aaron
> p.s. Now for an admission: the snapshots I'm testing with were
> originally taken with `-cpu max`. This was unintentional, and I
> understand if the response is that I can't expect `-cpu max` checkpoints
> to work across QEMU versions... but I also don't think that all of these
> issues can be blamed on that (notably CBAR and env->features).
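
To make the env->features point above concrete, here is a minimal sketch of
how removing enumerators shifts bit meanings in a saved bitmask. The two
enums are hypothetical, heavily abbreviated stand-ins (the names, order and
values are assumptions for illustration, not the real `enum arm_features`
from either release), chosen so that the old VFP3 bit lands on the new M bit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, abbreviated feature lists; NOT the real enum arm_features. */
enum old_features {          /* stand-in for the older build */
    OLD_FEATURE_VFP,         /* bit 0 */
    OLD_FEATURE_VFP3,        /* bit 1 */
    OLD_FEATURE_AUXCR,       /* bit 2 */
    OLD_FEATURE_M            /* bit 3 */
};

enum new_features {          /* stand-in for the newer build, VFP entries removed */
    NEW_FEATURE_AUXCR,       /* bit 0 */
    NEW_FEATURE_M            /* bit 1 */
};

/* Test a single bit of a saved feature word. */
static int has_feature(uint64_t features, int bit)
{
    assert(bit >= 0 && bit < 64);
    return (int)((features >> bit) & 1);
}

/* Feature word as the older build would save it for a CPU with only VFP3. */
static uint64_t snapshot_with_vfp3(void)
{
    return UINT64_C(1) << OLD_FEATURE_VFP3;
}
```

With these toy enums, `has_feature(snapshot_with_vfp3(), NEW_FEATURE_M)` is
nonzero: the bit that was saved to mean VFP3 is read back as M, because the
raw word is stored without any record of which enumerator each bit meant.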

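The register ID mismatch reported for CBAR can be illustrated with a
simplified model of matching an incoming register list against the local
one. This is a sketch under assumptions, not the code in
target/arm/machine.c: it assumes both ID lists are sorted ascending and that
an incoming ID with no local counterpart is treated as a load failure, which
is what the reported error suggests. `match_reg_lists` and the IDs used with
it are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Copy values for register IDs present in both lists.
 * Returns 0 on success, or -1 if the snapshot contains an ID that the
 * running build does not know (e.g. because a register's ID changed).
 */
static int match_reg_lists(const uint64_t *ours, uint64_t *our_vals,
                           size_t n_ours,
                           const uint64_t *theirs, const uint64_t *their_vals,
                           size_t n_theirs)
{
    size_t i = 0, v = 0;

    while (v < n_theirs) {
        if (i == n_ours || theirs[v] < ours[i]) {
            return -1;          /* in the snapshot, unknown locally */
        }
        if (theirs[v] > ours[i]) {
            i++;                /* known locally, absent from snapshot: skip */
            continue;
        }
        our_vals[i++] = their_vals[v++];   /* same ID: take the saved value */
    }
    return 0;
}
```

In this model, a register whose ID differs between the saving and loading
builds shows up as an unknown incoming ID, so the whole load fails even
though every other register matches.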
