Re: [Qemu-ppc] broken incoming migration

From: Alexey Kardashevskiy
Subject: Re: [Qemu-ppc] broken incoming migration
Date: Thu, 30 May 2013 19:31:53 +1000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130514 Thunderbird/17.0.6

On 05/30/2013 07:08 PM, Peter Lieven wrote:
> On 30.05.2013 10:18, Alexey Kardashevskiy wrote:
>> On 05/30/2013 05:49 PM, Paolo Bonzini wrote:
>>> On 30/05/2013 09:44, Alexey Kardashevskiy wrote:
>>>> Hi!
>>>> I found that migration is broken on the pseries platform; specifically,
>>>> this patch broke it:
>>>> f1c72795af573b24a7da5eb52375c9aba8a37972
>>>> migration: do not sent zero pages in bulk stage
>>>> The idea is not to send zero pages to the destination guest which is
>>>> expected to have 100% empty RAM.
>>>> However, on the pseries platform the guest always has some stuff in RAM
>>>> as a part of initialization (device tree, system firmware and rtas (?)),
>>>> so it is not completely empty. As the source guest cannot detect this, it
>>>> skips some pages during migration and we get a broken destination guest.
>>>> Bug.
>>>> While the idea is ok in general, I do not see any easy way to fix it, as
>>>> neither the QEMUMachine::init nor the QEMUMachine::reset callback has
>>>> information about whether we are about to receive a migration or not
>>>> (-incoming parameter), and we cannot move device-tree and system firmware
>>>> initialization anywhere else.
>>>> ram_bulk_stage is static and cannot be disabled from the platform
>>>> initialization code.
>>>> So what would the community suggest?
>>> Revert the patch. :)
>> I'll wait for 24 hours (forgot to cc: the author) and then post a revert
>> patch :)
> does this problem only occur on pseries emulation?

No idea, really.

> not sending zero pages is not only a performance benefit, it also makes
> overcommitted memory usable. MADV_DONTNEED seems to kick in asynchronously
> and memory is not available immediately.

Ok, I do not mind :)

> what I do not understand: if a memory region is not empty at the destination
> due to device tree, firmware etc., it should not be empty at the source
> either, so in theory this should not be a problem.

This is how it works: first QEMU allocates RAM and puts the device tree and
firmware somewhere in it. Then QEMU starts the guest, so the firmware starts
and loads the kernel, and then the kernel zeroes the whole (most of?) RAM,
including the area where the firmware used to be.

Now we migrate. If the source guest is already in the kernel, it knows
nothing about the memory area previously occupied by the firmware; to the
source that area is just empty pages, so they are skipped, while the
destination still holds its own freshly loaded firmware there. If the source
guest is still in the firmware, then those pages are not empty and they are
migrated correctly.
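
To make the failure concrete, here is a toy model of the sequence above. It
is only a sketch, not QEMU code: the page size, the marker bytes and every
name in it (fresh_ram, migrate_bulk and so on) are made up for illustration,
and it assumes the destination loads firmware into the same two pages as the
source did.

```python
PAGE_SIZE = 4096                           # assumed page size for the model
ZERO_PAGE = bytes(PAGE_SIZE)               # a page containing only zero bytes
FIRMWARE_PAGE = bytes([0xF1]) * PAGE_SIZE  # stand-in for device tree/firmware
KERNEL_PAGE = bytes([0x4B]) * PAGE_SIZE    # stand-in for live kernel data


def fresh_ram(num_pages, firmware_pages):
    """RAM as machine init leaves it: zeroes plus the firmware pages."""
    ram = [ZERO_PAGE] * num_pages
    for index in firmware_pages:
        ram[index] = FIRMWARE_PAGE
    return ram


def migrate_bulk(source_ram, dest_ram, skip_zero_pages):
    """Copy source pages into dest_ram, optionally skipping zero pages."""
    for index, page in enumerate(source_ram):
        if skip_zero_pages and page == ZERO_PAGE:
            continue  # page is never sent; destination keeps what it had
        dest_ram[index] = page
    return dest_ram


# Source guest: booted into the kernel, which zeroed the old firmware area
# (pages 0 and 1) and put its own data in page 3.
source = fresh_ram(4, firmware_pages=[0, 1])
source[0] = ZERO_PAGE
source[1] = ZERO_PAGE
source[3] = KERNEL_PAGE

# Destination guest: freshly initialized, firmware still in pages 0 and 1.
broken = migrate_bulk(source, fresh_ram(4, [0, 1]), skip_zero_pages=True)
fixed = migrate_bulk(source, fresh_ram(4, [0, 1]), skip_zero_pages=False)

print("skip zero pages, RAM matches source:", broken == source)
print("send all pages,  RAM matches source:", fixed == source)
```

With skipping enabled, pages 0 and 1 on the destination still contain the
firmware image although the source has zeroes there, which is the mismatch
described above.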