Re: [PATCH v3 3/3] migration/doc: We broke backwards compatibility
From: Juan Quintela
Subject: Re: [PATCH v3 3/3] migration/doc: We broke backwards compatibility
Date: Mon, 23 Oct 2023 13:09:06 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/28.3 (gnu/linux)

Fiona Ebner <f.ebner@proxmox.com> wrote:
> Am 15.05.23 um 10:32 schrieb Juan Quintela:
>> When we detect that we have broken backwards compantibility in a
>
> compatibility
done
> (...)
>
>> +
>> +In qemu-8.0 we got this commit: ::
>> +
>> + commit 9a6ef182c03eaa138bae553f0fbb5a123bef9a53
>> + Author: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>> + Date: Thu Mar 2 13:37:03 2023 +0000
>> +
>> + hw/pci/aer: Add missing routing for AER errors
>> +
>> +The relevant bits of the commit for our example are these:
>> +
>> + --- a/hw/pci/pcie_aer.c
>> + +++ b/hw/pci/pcie_aer.c
>> + @@ -112,6 +112,10 @@ int pcie_aer_init(PCIDevice *dev,
>> +
>> + pci_set_long(dev->w1cmask + offset + PCI_ERR_UNCOR_STATUS,
>> + PCI_ERR_UNC_SUPPORTED);
>> + + pci_set_long(dev->config + offset + PCI_ERR_UNCOR_MASK,
>> + + PCI_ERR_UNC_MASK_DEFAULT);
>> + + pci_set_long(dev->wmask + offset + PCI_ERR_UNCOR_MASK,
>> + + PCI_ERR_UNC_SUPPORTED);
>> +
>> + pci_set_long(dev->config + offset + PCI_ERR_UNCOR_SEVER,
>> + PCI_ERR_UNC_SEVERITY_DEFAULT);
>> +
>
> These changes are not part of commit
> 9a6ef182c0 ("hw/pci/aer: Add missing routing for AER errors")
> but rather the one before it, namely
> 010746ae1d ("hw/pci/aer: Implement PCI_ERR_UNCOR_MASK register")
grr, will fix that.
>> +The patch changes how we configure pci space for AER. But qemu fails
>
> Should QEMU and PCI be capitalized in the text parts?
I think that I changed all of them.
>> +when the pci space configuration is different betwwen source and
>
> between
done.
>> +destination.
>> +
>> +The following commit show how this got fixed:
>
> shows
done
> (...)
>
>> +
>> +So the normality has been restaured and everything is ok, no?
>
> restored
done
>> +
>> +Not really, now our matrix is much bigger. We started with the easy
>> +cases, migration from the same version to the same version always
>> +works:
>> +
>> +- $ qemu-7.2 -M pc-7.2 -> qemu-7.2 -M pc-7.2
>> +- $ qemu-8.0 -M pc-7.2 -> qemu-8.0 -M pc-7.2
>> +- $ qemu-8.0.1 -M pc-7.2 -> qemu-8.0.1 -M pc-7.2
>> +
>> +Now the interesting ones. When the QEMU processes versions are
>> +different. For the 1st set, their fail and we can do nothing, both
>> +versions are relased and we can't change anything.
>
> released
done
>> +
>> +- $ qemu-7.2 -M pc-7.2 -> qemu-8.0 -M pc-7.2
>> +- $ qemu-8.0 -M pc-7.2 -> qemu-7.2 -M pc-7.2
>> +
>> +These two are the ones that work. The whole point of making the
>> +change in qemu-8.0.1 release was to fix this issue:
>> +
>> +- $ qemu-7.2 -M pc-7.2 -> qemu-8.0.1 -M pc-7.2
>> +- $ qemu-8.0.1 -M pc-7.2 -> qemu-7.2 -M pc-7.2
>> +
>> +But now we found that qemu-8.0 can migrate neither to qemu-7.2 nor to
>> +qemu-8.0.1.
>> +
>> +- $ qemu-8.0 -M pc-7.2 -> qemu-8.0.1 -M pc-7.2
>> +- $ qemu-8.0.1 -M pc-7.2 -> qemu-8.0 -M pc-7.2
>> +
>> +So, if we start a pc-7.2 machine in qemu-8.0 we can't migrate it to
>> +anything except to qemu-8.0.
>> +
>> +Can we do better?
>> +
>> +Yeap. If we know that we are gonig to do this migration:
>
> going
done
>> +
>> +- $ qemu-8.0 -M pc-7.2 -> qemu-8.0.1 -M pc-7.2
>> +
>> +We can launche the appropiate devices with
>
> "launch" was already pointed out by Peter, but there's also "appropriate"
done
>> +
>> +--device...,x-pci-e-err-unc-mask=on
>> +
>> +And now we can receive a migration from 8.0. And from now on, we can
>> +do that migration to new machine types if we remember to enable that
>> +property for pc-7.2. Notice that we need to remember, it is not
>> +enough to know that the source of the migration is qemu-8.0. Think of
>> +this example:
>> +
>> +$ qemu-8.0 -M pc-7.2 -> qemu-8.0.1 -M pc-7.2 -> qemu-8.2 -M pc-7.2
>> +
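A minimal sketch of the rule behind this matrix, in Python (it is not QEMU code). It assumes that the only thing that matters is whether the PCI_ERR_UNCOR_MASK setting is the same on source and destination, that qemu-8.2 keeps it off by default for pc-7.2, and that passing an override stands in for x-pci-e-err-unc-mask=on. The names DEFAULT_UNC_MASK, unc_mask and can_migrate are made up for the illustration.

```python
from typing import Optional

# Default state of the AER uncorrectable-mask register for a pc-7.2 machine,
# per QEMU version, as described in the text above.
DEFAULT_UNC_MASK = {
    "7.2": False,    # register not implemented
    "8.0": True,     # the regression: enabled even for pc-7.2
    "8.0.1": False,  # fixed: disabled again for pc-7.2
    "8.2": False,    # ASSUMPTION: later releases keep the fixed default
}


def unc_mask(version: str, override: Optional[bool] = None) -> bool:
    """Effective setting: the version default unless explicitly overridden.

    override=True models launching the device with x-pci-e-err-unc-mask=on.
    """
    return DEFAULT_UNC_MASK[version] if override is None else override


def can_migrate(src_mask: bool, dst_mask: bool) -> bool:
    """Simplified rule: migration works only if both sides agree."""
    return src_mask == dst_mask


if __name__ == "__main__":
    cases = [
        ("7.2 -> 8.0", unc_mask("7.2"), unc_mask("8.0")),
        ("7.2 -> 8.0.1", unc_mask("7.2"), unc_mask("8.0.1")),
        ("8.0 -> 8.0.1", unc_mask("8.0"), unc_mask("8.0.1")),
        ("8.0 -> 8.0.1 (property on)", unc_mask("8.0"), unc_mask("8.0.1", True)),
        # Second hop of the chain: the guest still carries the 8.0 layout,
        # so the destination needs the property as well.
        ("8.0.1 (on) -> 8.2", unc_mask("8.0.1", True), unc_mask("8.2")),
        ("8.0.1 (on) -> 8.2 (property on)", unc_mask("8.0.1", True), unc_mask("8.2", True)),
    ]
    for name, src, dst in cases:
        print(f"{name}: {'works' if can_migrate(src, dst) else 'fails'}")
```

The last two cases are the point of the chain example: the setting travels with the guest, not with the version of the QEMU binary that happens to be the source.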
>> +In the second migration, the source is not qemu-8.0, but we still have
>> +that "problem" and have that property enabled. Notice that we need to
>> +continue having this mark/property until we have this machine
>> +rebooted. But it is not a normal reboot (that doesn't reload qemu); we
>> +need the mapchine to poweroff/poweron on a fixed qemu. And from now
>
> machine
done
Thanks a lot.