qemu-devel

Re: [Qemu-devel] When does live migration give up?


From: Alex Bligh
Subject: Re: [Qemu-devel] When does live migration give up?
Date: Wed, 04 Sep 2013 19:05:50 +0100

Paolo,

--On 4 September 2013 19:07:53 +0200 Paolo Bonzini <address@hidden> wrote:

> On 04/09/2013 17:24, Alex Bligh wrote:
>> We have seen a situation when migrating about 50 VMs at once where some
>> of them fail. I think this is because they are dirtying pages faster than
>> they can be transmitted.
>
> No, migration never "gives up".  It may never converge, but it keeps
> trying until cancelled.
>
> Could it be that you are choosing migration server ports from a small
> range, and some of them are failing because two migrations pick the same
> random port for the destination (which is where the server socket lies)?
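
For reference, the "may never converge" point quoted above can be illustrated with a toy model of iterative pre-copy. This is only a sketch under stated assumptions, not QEMU's actual algorithm: the function name, the page-based units and the stop threshold are all made up for illustration. Each round resends whatever was dirtied while the previous round was on the wire, so the dirty set only shrinks when the guest dirties pages more slowly than the link can carry them.

```python
def simulate_precopy(ram_pages, dirty_rate, bandwidth, stop_threshold, max_rounds=50):
    """Toy pre-copy model (hypothetical, not QEMU code).

    ram_pages      -- guest RAM size in pages
    dirty_rate     -- pages the guest dirties per second
    bandwidth      -- pages the link can send per second
    stop_threshold -- dirty set small enough to pause the guest and finish

    Returns the round in which the final stop-and-copy could start,
    or None if the dirty set never gets small enough within max_rounds.
    """
    remaining = ram_pages  # the first round sends all of RAM
    for round_no in range(1, max_rounds + 1):
        if remaining <= stop_threshold:
            return round_no
        # Sending `remaining` pages takes remaining / bandwidth seconds;
        # the guest dirties dirty_rate pages per second during that time.
        remaining = min(ram_pages, remaining * dirty_rate // bandwidth)
    return None


# One migration with the link to itself: the dirty set shrinks 10x per round.
alone = simulate_precopy(1_000_000, 10_000, 100_000, 1_000)

# Same guest, but 50 migrations share the link (illustrative numbers only).
shared = simulate_precopy(1_000_000, 10_000, 100_000 // 50, 1_000)
```

With these made-up numbers the first call finishes and the second returns None, i.e. the migration keeps going round in circles. Per the quoted reply that alone should not make it fail, only run until cancelled.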

Should not be that. We create FDs (which are sockets) and pass them in at
both ends. Approx 10% of migrations die after many minutes on the
customer's platform. This does not appear to happen if migrations are
not carried out 50 at a time.

We appear to be getting something other than 'ms' returned through the
monitoring system. Unhelpfully, the value actually returned is not logged.
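
A minimal sketch of the kind of logging that would have helped here, assuming the management layer polls QMP `query-migrate` and gets back a JSON object with the status under `return.status`. The set of expected status strings below is an assumption and should be checked against the QEMU version in use; the function and logger names are hypothetical.

```python
import logging

log = logging.getLogger("migration-monitor")  # hypothetical logger name

# Assumed status strings; verify against the QEMU version actually deployed.
KNOWN_STATUSES = {"active", "completed", "failed", "cancelled"}


def classify_migration_reply(reply):
    """Return the migration status from a parsed query-migrate reply.

    Anything unexpected (missing status, unknown string, QMP error
    object) is logged in full and reported as "unknown", so the raw
    reply is never silently dropped.
    """
    info = reply.get("return")
    status = info.get("status") if isinstance(info, dict) else None
    if status not in KNOWN_STATUSES:
        log.warning("unexpected query-migrate reply: %r", reply)
        return "unknown"
    if status == "failed":
        # Keep the whole reply; it is the only evidence after the fact.
        log.error("migration failed, full reply: %r", reply)
    return status
```

The point is simply that the raw reply gets written out whenever it is not one of the expected values, rather than being collapsed into a generic failure.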

Is there anything (apart from the socket closing prematurely) which can
cause a failed migration after many minutes? We've seen problems where
the destination is not set up the same as the source (e.g. different
numbers of NICs) but IIRC that fails much earlier.

To make things easier (cough), this is qemu 1.0 (as shipped with Ubuntu
Precise).

--
Alex Bligh
