From: Dr. David Alan Gilbert
Subject: Re: [Qemu-devel] [RFC Design Doc]Speed up live migration by skipping free pages
Date: Wed, 20 Apr 2016 09:10:34 +0100
User-agent: Mutt/1.5.24 (2015-08-30)

* Li, Liang Z (address@hidden) wrote:
> > Subject: Re: [RFC Design Doc]Speed up live migration by skipping free pages
> >
> > * Li, Liang Z (address@hidden) wrote:
> > > Hi Dave,
> > >
> > > I am now working on how to benefit post-copy by skipping the free
> > > pages, and I remember you said we should let the destination know
> > > which pages are free so as to avoid requesting them from the
> > > source.
> > >
> > > We have two solutions:
> > >
> > > a. Send the migration dirty page bitmap to the destination before
> > > post-copy starts, so the destination can decide whether to request a
> > > page or place a zero page by checking the bitmap. The advantage is
> > > that we avoid sending the free pages; the disadvantage is that we
> > > have to send extra data to the destination.
> > >
> > > b. Check each page request on the source side; if the page is not
> > > dirty, send a zero page header to the destination.
> > >
> > > What's your opinion about them?
> >
> > (b) is certainly simpler - and requires no changes on the destination side
> > or the protocol.
> > If you then decided to add stuff to send the dirty page bitmap later you
> > could do.
> >
> > However, there are some other problems to figure out:
> > 1) The source side quits when it thinks it's sent all pages; when is your
> > source going to quit? If it quits while the destination still has
> > unfulfilled pages then the destination will fail.
>
> The source quits the same way as before, but before quitting it tells the
> destination that it is quitting.
> After that, the destination doesn't need to request pages from the source; it
> just places zero pages. Would that work?
Yes, maybe. The destination side would somehow have to clean up once it has all
the zero pages, but it currently doesn't keep a count or map of which pages
still need to be received.
Actually, perhaps that's easy - when the destination receives the 'quit it's
zero' message from the source, maybe it just turns off userfault; any fresh
accesses would get a zero page. However, I'm not sure what happens to pages
that are already blocked/waiting for a page - that we'd need to check with
Andrea/test.
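
To make the (b) + 'quit' idea concrete, here is a rough toy model in Python. It
is only a sketch, not QEMU code: the class names, the message tuples and the
QUIT_REST_IS_ZERO tag are all made up for illustration, and a dict lookup
stands in for userfault. It assumes the source's dirty set only contains pages
that still hold data to send, and that the destination never requests a page
it already has.

```python
from dataclasses import dataclass, field

PAGE_SIZE = 4096
ZERO_PAGE = bytes(PAGE_SIZE)


@dataclass
class Source:
    """Hypothetical source side: guest RAM plus the set of pages still to send."""
    ram: dict    # page index -> page contents
    dirty: set   # pages that still need sending (free pages already cleared)

    def handle_request(self, page):
        # Option (b): a request for a page that is not dirty gets a
        # zero-page header instead of page data.
        if page in self.dirty:
            self.dirty.discard(page)
            return ("PAGE", page, self.ram[page])
        return ("ZERO", page, None)

    def drain(self):
        # Background send of everything left, then the (made-up) message
        # telling the destination that all remaining pages are zero.
        for page in sorted(self.dirty):
            yield ("PAGE", page, self.ram[page])
        self.dirty.clear()
        yield ("QUIT_REST_IS_ZERO", None, None)


@dataclass
class Destination:
    """Hypothetical destination side; 'received' stands in for mapped pages."""
    received: dict = field(default_factory=dict)
    source_quit: bool = False

    def on_message(self, msg):
        kind, page, data = msg
        if kind == "PAGE":
            self.received[page] = data
        elif kind == "ZERO":
            self.received[page] = ZERO_PAGE
        elif kind == "QUIT_REST_IS_ZERO":
            self.source_quit = True

    def access(self, page, source):
        # A missing page is a fault: ask the source, unless it has already
        # quit, in which case the page is simply filled with zeros locally.
        if page not in self.received:
            if self.source_quit:
                self.received[page] = ZERO_PAGE
            else:
                self.on_message(source.handle_request(page))
        return self.received[page]
```

Note the model says nothing about faults that are already in flight when the
quit message arrives, which is exactly the open question above.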
> > 2) I sent a 'discard' bitmap of pages for the destination to unmap
> > just at the change into postcopy; so I'm already sending one bitmap;
> > this is for pages that have been changed since they were first sent
> > but not yet resent.
> > Be careful about how any changes you make interact with the generation
> > of that bitmap.
>
> Thanks for the reminder.
>
> > 3) It's potentially very slow if the destination has to keep requesting
> > blank pages.
>
> Yes, really.
>
> > Essentially what you're suggesting for (a) is a way to send a compressed set
> > of 'page is zero' messages based on a bitmap, and you're worried about the
> > time to send it - which I think is where we started the conversation about
> > time to deal with zeros :-). Two ways to think of that are:
>
> All my thoughts are in your words. :)
>
> > 4) I already send one bitmap - so you're only doubling it in theory;
> > I originally used a sparse bitmap but the suggestion was it was
> > more complex than needed and it turned into more of a run-length
> > encoding.
> > 5) You're worried it would increase the downtime as you send the bitmap;
> > however if you implement (b) as well as (a) then you can send the data for
> > (a) after the destination is running and not increase the downtime.
>
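
As a rough illustration of the run-length idea in (4): a bitmap that is mostly
long runs can be sent as (start, length) pairs rather than bit by bit. The
sketch below is Python and purely illustrative; the function names are made up
and this is not the actual wire format.

```python
def rle_encode(bitmap):
    """Encode a sequence of 0/1 bits as (start, length) runs of set bits."""
    runs = []
    start = None
    for i, bit in enumerate(bitmap):
        if bit and start is None:
            start = i                        # a run of set bits begins
        elif not bit and start is not None:
            runs.append((start, i - start))  # the run ended just before i
            start = None
    if start is not None:                    # run reaches the end of the bitmap
        runs.append((start, len(bitmap) - start))
    return runs


def rle_decode(runs, nbits):
    """Rebuild the bitmap of nbits bits from (start, length) runs."""
    bitmap = [0] * nbits
    for start, length in runs:
        bitmap[start:start + length] = [1] * length
    return bitmap
```

The size then depends on the number of runs rather than the amount of RAM, so
a guest with large contiguous free areas costs very little to describe.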
> The downtime is the main reason I started to consider (b); for a VM with a
> huge amount of RAM, the downtime will become a big problem. Obviously, (a) is
> more efficient than (b).
With your idea about sending a 'quit' message to tell the destination the
remaining pages are all zero, I'm not sure that's true - (b) + the quit message
sounds like a good combination.
Dave
>
>
> > Dave
> >
> > --
> > Dr. David Alan Gilbert / address@hidden / Manchester, UK
--
Dr. David Alan Gilbert / address@hidden / Manchester, UK