qemu-devel

Re: [PATCH 18/20] migration: Postcopy preemption enablement


From: Peter Xu
Subject: Re: [PATCH 18/20] migration: Postcopy preemption enablement
Date: Wed, 23 Feb 2022 21:05:34 +0800

On Wed, Feb 23, 2022 at 09:56:08AM +0000, Dr. David Alan Gilbert wrote:
> * Peter Xu (peterx@redhat.com) wrote:
> > On Tue, Feb 22, 2022 at 10:52:23AM +0000, Dr. David Alan Gilbert wrote:
> > > This does get a bit complicated, which worries me a bit; the code here
> > > is already quite complicated.
> > 
> > Right, it's the way I chose in this patchset on solving this problem.  Not
> > sure whether there's any better and easier way.
> > 
> > For example, we could have used a new thread to send requested pages, and
> > synchronize it with the main thread.  But that'll need another kind of
> > complexity, and I can't quickly tell whether that'll be better.
> > 
> > For this single patch, more than half of the complexity comes from the
> > ability to interrupt sending one huge page half-way.  It's a bit of a pity
> > that that part will ultimately be a no-op with doublemap.
> 
> How does that huge-page interruption interact with recovery?
> i.e. do we know the start of that hugepage arrived?

That's a great question.  I should have mentioned that, but I forgot.

When postcopy is interrupted while sending a huge page, the dest QEMU will
not be able to do the UFFDIO_COPY of that huge page (because it lacks
data!), which also means the received bitmap of that huge page will remain
completely clear.

So when recovery happens, the dest QEMU will tell the source about this fact
("Hey, this huge page has never been transferred", even if a few of its
small pages have actually been transferred already!).  Then the whole huge
page will be resent.
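
To make the above concrete, here is a toy model of the idea (not QEMU
code: the Toy* types, toy_* functions and sizes are all made up for
illustration, and the real receivedmap handling is more involved).  It
assumes a huge page is only marked as received once all of its small pages
have arrived and it has been placed, and that on recovery the source treats
everything not in the received bitmap as dirty:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy sizes: one huge page is made of 4 small pages. */
#define TOY_SMALL_PER_HUGE 4
#define TOY_NR_HUGE        2
#define TOY_NR_SMALL       (TOY_SMALL_PER_HUGE * TOY_NR_HUGE)

/* Toy destination state; one received bit per small page. */
typedef struct {
    bool receivedmap[TOY_NR_SMALL];
} ToyDest;

static void toy_dest_init(ToyDest *d)
{
    memset(d, 0, sizeof(*d));
}

/*
 * Model the arrival of n_small_arrived small pages of one huge page.
 * The huge page is placed (standing in for UFFDIO_COPY) and marked as
 * received only when it is complete; a partial arrival leaves the
 * received bitmap untouched.
 */
static void toy_dest_receive(ToyDest *d, size_t huge, size_t n_small_arrived)
{
    assert(huge < TOY_NR_HUGE);
    assert(n_small_arrived <= TOY_SMALL_PER_HUGE);

    if (n_small_arrived < TOY_SMALL_PER_HUGE) {
        return; /* interrupted half-way: nothing is recorded */
    }
    for (size_t i = 0; i < TOY_SMALL_PER_HUGE; i++) {
        d->receivedmap[huge * TOY_SMALL_PER_HUGE + i] = true;
    }
}

/*
 * Model the recovery sync: every small page the destination did not
 * mark as received becomes dirty on the source again.  Returns the
 * number of small pages that will be resent.
 */
static size_t toy_src_resync(bool *dirty, const ToyDest *d)
{
    size_t to_resend = 0;

    for (size_t i = 0; i < TOY_NR_SMALL; i++) {
        dirty[i] = !d->receivedmap[i];
        to_resend += dirty[i];
    }
    return to_resend;
}
```

In this model, if huge page 0 arrived completely and huge page 1 was
interrupted after two small pages, toy_src_resync() marks all four small
pages of huge page 1 as dirty, including the two that already made it
across the wire.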

When postcopy preempt joins the equation, what we need to do is to reset
the temp huge pages (postcopy_pause_incoming()):

    /*
     * If network is interrupted, any temp page we received will be useless
     * because we didn't mark them as "received" in receivedmap.  After a
     * proper recovery later (which will sync src dirty bitmap with receivedmap
     * on dest) these cached small pages will be resent again.
     */
    for (i = 0; i < mis->postcopy_channels; i++) {
        postcopy_temp_page_reset(&mis->postcopy_tmp_pages[i]);
    }

This chunk of code lives in "migration: Introduce postcopy channels on dest
node" rather than in the recovery patch, which I think is the major reason
why it's easily overlooked.  However, it needs to be there so that existing
postcopy doesn't break.

So this was kind of hidden in the past because we didn't manage the temp
huge pages explicitly (they used to be local vars, so they got reset
automatically), but now we need to do that by hand.
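
A rough sketch of why the explicit reset matters once the temp page state
outlives a single receive loop.  Again this is a toy, not the QEMU
structures: ToyTmpPage, ToyIncoming and the toy_* helpers are hypothetical,
and the only state modelled is a per-channel count of buffered small pages:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy sizes. */
#define TOY_PAGES_PER_HUGE 4
#define TOY_CHANNELS       2

/* Per-channel temp huge page state that persists across a pause. */
typedef struct {
    size_t target_pages; /* small pages buffered so far */
} ToyTmpPage;

typedef struct {
    ToyTmpPage tmp_pages[TOY_CHANNELS];
} ToyIncoming;

static void toy_tmp_page_reset(ToyTmpPage *p)
{
    p->target_pages = 0;
}

/*
 * Buffer one small page on a channel.  Returns true when the toy model
 * considers the huge page complete and would place it.
 */
static bool toy_receive_small(ToyIncoming *mis, size_t channel)
{
    assert(channel < TOY_CHANNELS);
    ToyTmpPage *p = &mis->tmp_pages[channel];

    if (++p->target_pages == TOY_PAGES_PER_HUGE) {
        toy_tmp_page_reset(p);
        return true;
    }
    return false;
}

/* On pause, drop whatever was buffered on every channel. */
static void toy_pause_incoming(ToyIncoming *mis)
{
    for (size_t i = 0; i < TOY_CHANNELS; i++) {
        toy_tmp_page_reset(&mis->tmp_pages[i]);
    }
}
```

In this toy, if two small pages were buffered before the interruption and
the reset is skipped, the stale count makes the huge page look complete
after only two of the resent small pages.  With toy_pause_incoming() called
on pause, all four resent small pages are needed again, which matches the
source resending the whole huge page.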

> 
> > 
> > However, I kept those only because we don't know when doublemap will be
> > ready, not to mention landed.  Meanwhile we can't assume all kernels will
> > have doublemap even in the future.
> 
> Yeh, if doublemap was already here you could make it a condition of
> allowing you to set the option.

Right.  We'll 100% skip the huge page interruption, just like when the
ramblock is using PAGE_SIZE small pages.

-- 
Peter Xu



