
Re: [Qemu-devel] Stalls on Live Migration of VMs with a lot of memory


From: Peter Lieven
Subject: Re: [Qemu-devel] Stalls on Live Migration of VMs with a lot of memory
Date: Wed, 04 Jan 2012 12:42:10 +0100
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.23) Gecko/20110921 Thunderbird/3.1.15

On 04.01.2012 12:28, Paolo Bonzini wrote:
On 01/04/2012 12:22 PM, Peter Lieven wrote:
There were patches to move RAM migration to a separate thread. The
problem is that they broke block migration.

However, asynchronous NBD is in and streaming will follow suit soon.
As soon as we have those two features, we might as well remove the
block migration code.

Ok, so it's a matter of time, right?

Well, there are other solutions of varying complexity in the works that might remove the need for the migration thread, or at least reduce the problem (post-copy migration, XBRLE, vectorized hot loops). But yes, we are aware of the problem and we should solve it one way or another.
I have read about all these approaches, and they all seem promising.

Would it make sense to patch ram_save_block to always process a full RAM block?

If I understand the proposal, then migration would hardly be live anymore. The biggest RAM block in a 32G machine is, well, 32G big. Other RAM blocks are for the VRAM and for some BIOS data, but they are very small in proportion.
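To illustrate that layout, here is a toy model of such a block list (this is not the real QEMU data structures; the names and sizes are only illustrative):

/* Toy illustration of the RAM block layout described above -- not the
 * real QEMU data structures.  The point: guest memory is one huge
 * block plus a handful of tiny ones. */
#include <stdio.h>
#include <stdint.h>

struct toy_ram_block {
    const char *name;   /* illustrative name */
    uint64_t size;      /* bytes */
};

int main(void)
{
    const struct toy_ram_block blocks[] = {
        { "guest-ram", 32ULL << 30 },  /* 32G of guest memory */
        { "vga-vram",  16ULL << 20 },  /* 16M of video RAM */
        { "bios",      128ULL << 10 }, /* 128K of BIOS data */
    };

    for (size_t i = 0; i < sizeof(blocks) / sizeof(blocks[0]); i++) {
        printf("%-10s %20llu bytes\n", blocks[i].name,
               (unsigned long long)blocks[i].size);
    }
    return 0;
}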
Ok, then I misunderstood the RAM blocks thing; I thought the guest RAM would consist of a collection of many RAM blocks. Let me describe it differently, then: would it make sense to process bigger portions of memory (e.g. 1M) in stage 2, to reduce the number of calls to cpu_physical_memory_reset_dirty by running it on those bigger portions instead of on single pages? We might lose a few dirty pages, but they would be tracked in the next stage-2 iteration, or in stage 3 at the latest. What would be necessary is that nobody marks a page dirty while I copy the dirty information for the portion of memory I want to process. Roughly what I have in mind is sketched below.
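A rough, self-contained sketch of the batching idea (this is not QEMU code; dirty[] and send_page() are just stand-ins for the real dirty bitmap and the migration stream):

/* Sketch of batched dirty-page processing: snapshot and clear the
 * dirty flags for a whole 1M chunk at once, instead of resetting the
 * dirty state page by page.  Pages dirtied after the snapshot are
 * simply caught by a later pass (or by the final stage). */
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define PAGE_SIZE        4096
#define CHUNK_SIZE       (1024 * 1024)
#define PAGES_PER_CHUNK  (CHUNK_SIZE / PAGE_SIZE)   /* 256 pages */
#define TOTAL_PAGES      (PAGES_PER_CHUNK * 8)      /* 8M toy guest */

static bool dirty[TOTAL_PAGES];     /* stand-in for the dirty bitmap */

static void send_page(size_t page)  /* stand-in for the migration stream */
{
    printf("sending page %zu\n", page);
}

static void migrate_chunk(size_t first_page)
{
    bool snapshot[PAGES_PER_CHUNK];

    /* Copy and clear the dirty information for the whole chunk in one go. */
    memcpy(snapshot, &dirty[first_page], sizeof(snapshot));
    memset(&dirty[first_page], 0, sizeof(snapshot));

    /* Send every page that was dirty at snapshot time. */
    for (size_t i = 0; i < PAGES_PER_CHUNK; i++) {
        if (snapshot[i]) {
            send_page(first_page + i);
        }
    }
}

int main(void)
{
    dirty[3] = dirty[300] = true;   /* pretend the guest wrote here */
    for (size_t chunk = 0; chunk < TOTAL_PAGES; chunk += PAGES_PER_CHUNK) {
        migrate_chunk(chunk);
    }
    return 0;
}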

- In stage 3 the VM is stopped, right? So there can't be any more dirty blocks after scanning the whole memory once?

No, stage 3 is entered when there are very few dirty memory pages remaining. This may happen after scanning the whole memory many times. It may even never happen if migration does not converge because of low bandwidth or too strict downtime requirements.
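For reference, the decision to leave stage 2 is essentially a bandwidth-versus-downtime estimate. A simplified, self-contained model of that check (not the literal QEMU code; the names and numbers are made up):

/* Simplified model of the stage-2 -> stage-3 decision described above.
 * Migration stays in stage 2 until the remaining dirty data could be
 * pushed within the allowed downtime; only then is the guest paused. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define TARGET_PAGE_SIZE 4096

static bool ready_for_stage3(uint64_t dirty_pages,
                             double bandwidth_bytes_per_sec,
                             double max_downtime_sec)
{
    double remaining = (double)dirty_pages * TARGET_PAGE_SIZE;
    double expected_downtime = remaining / bandwidth_bytes_per_sec;

    /* If the guest dirties pages faster than they can be sent, this
     * never becomes true and migration does not converge. */
    return expected_downtime <= max_downtime_sec;
}

int main(void)
{
    /* 50000 dirty pages (~200 MB) over a 1 Gbit/s link with a 30 ms
     * downtime cap: far from converged, stay in stage 2. */
    printf("%d\n", ready_for_stage3(50000, 125e6, 0.03));

    /* 500 dirty pages (~2 MB) fit within 30 ms at 1 Gbit/s: stage 3. */
    printf("%d\n", ready_for_stage3(500, 125e6, 0.03));
    return 0;
}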
Ok, is there a chance that I lose one final page if it is modified just after I walked over it and I found no other page dirty (so bytes_sent = 0)?

Peter

Paolo



