qemu-devel
Re: [Qemu-devel] [PATCH 00/46] Postcopy implementation


From: Dr. David Alan Gilbert
Subject: Re: [Qemu-devel] [PATCH 00/46] Postcopy implementation
Date: Thu, 10 Jul 2014 16:49:44 +0100
User-agent: Mutt/1.5.23 (2014-03-12)

* Andrea Arcangeli (address@hidden) wrote:
> On Thu, Jul 10, 2014 at 02:37:43PM +0100, Dr. David Alan Gilbert wrote:
> > * Eric Blake (address@hidden) wrote:
> > > Is there any need for an
> > > event telling libvirt that enough pre-copy has occurred to make a
> > > postcopy worthwhile?
> > 
> > I'm not sure that qemu knows much more than management does at that
> > point; any such decision you can make based on an arbitrary cut off
> > (i.e. migration is taking too long) or you could consider something
> > based on some of the other stats that migration already exposes
> > (like the dirty pages stats); if we've got any more stats that you
> > need we can always expose them.
> >
> > Agreed; although we can just do that independently of this big patch set.
> 
> It can be independent yes, but I think such event is needed (and once
> we add such event I hope we can get rid of the polling libvirt is
> doing for pure precopy too).
> 
> I think for very large guests what should happen is a single _lazy_
> pass of precopy and then immediately postcopy.
> 
> That's why I think an event that notifies libvirt when it should issue
> the postcopy command is good, to be able to implement the single
> _lazy_ pass and nothing more than that.
> 
> qemu should stop precopy and the source guest just before sending the
> event, so then libvirt can assign all storage to the destination just
> before issuing the postcopy command. By the time the event has been
> raised by qemu, the guest in the source qemu must never run
> anymore. So it is actually the same event needed in pure precopy too
> (except when using precopy+postcopy the "precopy complete" event will
> fire much sooner). We'll still need a parameter to precopy to tell
> qemu when precopy should stop.

That's an interesting, different type of event; I think we probably
have that first-pass information, but it's not part of the 'state'
(i.e. the started/completed/cancelled enum).

> The single precopy lazy pass would consist of clearing the dirty
> bitmap, starting precopy, then if any page is found dirty by the time
> precopy tries to send it, we skip it. We only send those pages in
> precopy that haven't been modified yet by the time we reach them in
> precopy.
> 
> Pages heavily modified will be sent purely through
> postcopy. Ultimately postcopy will be a page sorting feature to
> massively decrease the downtime latency, and to reduce to 2*ramsize
> the maximum amount of data transferred on the network without having
> to slow down the guest artificially. We'll also know exactly the
> maximum time in advance that it takes to migrate a large host no
> matter the load in it (2*ramsize divided by the network bandwidth
> available at the migration time). It'll be totally deterministic, no
> black magic slowdowns anymore.
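
To make the lazy pass quoted above concrete, here is a toy model of it
in Python. It is only a sketch under stated assumptions, not how qemu
implements anything: the names lazy_pass and worst_case_seconds, the
write-trace format and the 4 KiB page size are all made up for
illustration. It shows why each page crosses the wire at most twice
(once in precopy, once in postcopy), which is where the 2*ramsize bound
comes from.

```python
PAGE_SIZE = 4096  # bytes; assumed page size, for illustration only


def lazy_pass(num_pages, writes):
    """Simulate a single lazy precopy pass over pages 0..num_pages-1.

    writes is an iterable of (step, page) pairs: the guest dirties
    `page` just before the pass visits page index `step`.
    Returns (precopy_pages, postcopy_pages).
    """
    writes_by_step = {}
    for step, page in writes:
        writes_by_step.setdefault(step, []).append(page)

    dirty = set()  # stands in for the dirty bitmap, cleared before the pass
    precopy = []
    for current in range(num_pages):
        dirty.update(writes_by_step.get(current, ()))
        if current in dirty:
            continue  # already modified when we reach it: leave it to postcopy
        precopy.append(current)

    # Anything dirty at the end of the pass, whether it was skipped or
    # dirtied after being sent, has to go through postcopy.
    postcopy = sorted(dirty)
    return precopy, postcopy


def worst_case_seconds(ram_bytes, bandwidth_bytes_per_s):
    """Upper bound from the quoted text: 2*ramsize / bandwidth."""
    return 2 * ram_bytes / bandwidth_bytes_per_s


if __name__ == "__main__":
    pre, post = lazy_pass(8, [(0, 5), (2, 1), (3, 3), (6, 2)])
    print("precopy pages: ", pre)
    print("postcopy pages:", post)
    sent_bytes = (len(pre) + len(post)) * PAGE_SIZE
    print("bytes sent:", sent_bytes, "bound:", 2 * 8 * PAGE_SIZE)
```

In the example trace, pages 3 and 5 are skipped by precopy because they
are already dirty when reached, while pages 1 and 2 are sent by precopy
and then dirtied, so they are sent a second time by postcopy. No page is
ever sent a third time, whatever the trace looks like.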

There is a trade-off: cutting precopy short does reduce the network
bandwidth used, but the other side of it is that you would incur more
postcopy round trips, so your average latency will probably increase.
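
A back-of-the-envelope way to look at that latency cost, as a sketch
only: assume every page the destination faults on before it has arrived
costs roughly one network round trip plus the time to transfer the page.
The function name demand_fetch_stall and the default page size and
bandwidth are assumptions for illustration, not measurements or anything
from the patch set.

```python
def demand_fetch_stall(faulted_pages, rtt_s,
                       page_size=4096, bandwidth_bytes_per_s=1.25e9):
    """Rough total time the guest spends waiting on demand-fetched pages.

    Assumes each faulted page costs one round trip plus its own
    transfer time, and that faults are served one at a time.
    """
    per_fault = rtt_s + page_size / bandwidth_bytes_per_s
    return faulted_pages * per_fault


if __name__ == "__main__":
    # More pages left to postcopy means more faults that pay the round trip.
    for faults in (1_000, 10_000, 100_000):
        stall = demand_fetch_stall(faults, rtt_s=0.0005)
        print(f"{faults:>7} faulted pages -> about {stall:.2f} s of stall")
```

Under this model the stall grows linearly with the number of pages that
precopy left behind and that the guest touches before the background
transfer delivers them, which is the cost to weigh against the bandwidth
saved.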

Dave
--
Dr. David Alan Gilbert / address@hidden / Manchester, UK
