Re: [Qemu-devel] Migration To-do list
From: Isaku Yamahata
Subject: Re: [Qemu-devel] Migration To-do list
Date: Wed, 14 Nov 2012 11:23:18 +0900
User-agent: Mutt/1.5.19 (2009-01-05)
On Tue, Nov 13, 2012 at 05:46:13PM +0000, Hudzia, Benoit wrote:
> Hi,
>
> One concept we have been playing around with in the context of hybrid and
> post copy, and which might make sense if you are orienting your effort toward
> RDMA / post copy, is to move most of the logic to the destination side.
>
> This is one thing you might want to consider, as it can solve some of the
> issues you currently have and allow you to maintain almost a single API /
> protocol once you integrate the post copy approach.
>
> The idea is to drive the migration from the destination side, i.e. the pages
> are pulled by the destination and not pushed from the source side.
>
> Ex: current pre-copy:
>
> * extract the dirty bitmap (dirty bitmap extraction can be scheduled or
> triggered by the destination)
> * send it to the destination side
> * have the destination iterate over the bitmap (it can do page
> prioritization here)
IIRC you mentioned page prioritization last year, but not this year.
Is it still supported?
Where is it implemented, in QEMU or in the kernel?
> * depending on the protocol:
> _ with a standard socket (or RDS):
> . destination: requests page(s) <- can be batched
> . source: receives the request and sends back the page(s)
> . destination: processes the page(s)
> _ with RDMA:
> . destination: reads the page from the source into a local page (the
> page has been mapped for RDMA at bitmap extraction) (RDMA supports
> scatter/gather)
I'm not familiar with RDMA, but I understand that it requires the sender and
receiver to exchange DMA addresses in advance and to pin down the pages.
Is that correct?
> _ with post copy:
> . pretty much the same, but the dirty bitmap reset is
> done in the kernel during the post copy operation (this provides better
> dirty bit tracking granularity)
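
To make the socket variant of the pull model above concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the names Source, Destination, write, handle_request and pull are hypothetical and do not correspond to QEMU code, the "network" is a plain function call, and a set of page numbers stands in for the dirty bitmap.

```python
class Source:
    """Hypothetical source side: only tracks dirty pages and answers requests."""

    def __init__(self, pages):
        self.pages = pages            # page number -> page contents
        self.dirty = set(pages)       # every page starts out dirty
        self.requests_served = 0

    def write(self, page, data):
        """Guest write: update the page and mark it dirty again."""
        self.pages[page] = data
        self.dirty.add(page)

    def extract_dirty_bitmap(self):
        """Return the dirty page numbers and reset the tracking."""
        bitmap, self.dirty = sorted(self.dirty), set()
        return bitmap

    def handle_request(self, page_numbers):
        """Event-style handler: answer one (possibly batched) page request."""
        self.requests_served += 1
        return {n: self.pages[n] for n in page_numbers}


class Destination:
    """Hypothetical destination side: drives the migration by pulling pages."""

    def __init__(self, batch_size=4, priority=None):
        self.ram = {}
        self.batch_size = batch_size
        # Page prioritization hook: lower key means "fetch earlier".
        self.priority = priority or (lambda n: n)

    def pull(self, source):
        """Run one round: get the bitmap, then request pages in batches."""
        order = sorted(source.extract_dirty_bitmap(), key=self.priority)
        for i in range(0, len(order), self.batch_size):
            batch = order[i:i + self.batch_size]
            self.ram.update(source.handle_request(batch))
        return len(order)             # number of pages pulled this round
```

In this sketch, pulling 10 dirty pages with batch_size=4 costs 3 request round trips instead of 10, which is the batching compensation mentioned for the socket case. Calling pull() repeatedly until it returns 0 models the iterative pre-copy rounds.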
>
>
> Disadvantage:
> * adds a round trip, which can be compensated for with batching (only
> with a standard socket)
>
> Advantage:
> * most of the heavy lifting is done on the destination side, leaving the
> source to respond to requests in an event-based fashion
> * resolves a lot of the issues you have with threading on the sender
> side (accounting etc.)
> * extremely friendly to optimised solutions
> * if bitmap generation is expensive, we can overlap the generation of
> successive bitmaps, creating a semi-continuous delivery of them that
> guarantees an uninterrupted and optimised flow. => we decouple bitmap
> generation from the send/receive operation.
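
The decoupling of bitmap generation from the send/receive operation could look roughly like the following producer/consumer sketch in Python. This is an assumption about one possible structure, not a description of any existing patch: generate_bitmaps, transfer and run_pipeline are hypothetical names, each "bitmap" is a set of page numbers, and fetch_page stands in for whatever transport is used.

```python
import queue
import threading


def generate_bitmaps(rounds, out):
    """Producer: stands in for (possibly expensive) dirty bitmap extraction."""
    for dirty_pages in rounds:
        out.put(sorted(dirty_pages))
    out.put(None)                     # sentinel: no more bitmaps


def transfer(bitmaps, fetch_page):
    """Consumer: fetches pages for each bitmap as soon as it is available."""
    received = []
    while True:
        bitmap = bitmaps.get()
        if bitmap is None:
            break
        for page in bitmap:
            received.append(fetch_page(page))
    return received


def run_pipeline(rounds, fetch_page, depth=2):
    """Overlap bitmap generation with transfer using a bounded queue."""
    bitmaps = queue.Queue(maxsize=depth)
    producer = threading.Thread(target=generate_bitmaps, args=(rounds, bitmaps))
    producer.start()
    received = transfer(bitmaps, fetch_page)
    producer.join()
    return received
```

The bounded queue (depth) limits how far bitmap generation can run ahead of the transfer, so the next bitmap is usually ready when the current one has been processed.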
>
>
>
> Anyway, I will notify you as soon as I have the patch / library available
> for RDMA / postcopy.
>
> Note on the fault tolerance part: this requires a lot more heavy code
> optimisation and poking around to guarantee efficient checkpointing. Most of
> the solutions we tested so far (Remus and an old version of Kemari) scale
> poorly. Again, an RDMA / post copy solution is kind of necessary when you
> talk about checkpointing enterprise-class applications.
IIRC the Kemari guys evaluated the IB case. I'm not sure whether it was with
RDMA or IPoIB.
thanks,
>
>
> Regards
> Benoit
>
>
>
>
>
> > -----Original Message-----
> > From: Juan Quintela [mailto:address@hidden
> > Sent: 13 November 2012 16:19
> > To: qemu-devel qemu-devel; Orit Wasserman; address@hidden;
> > Hudzia, Benoit; Isaku Yamahata; Michael Roth
> > Subject: Migration ToDo list
> >
> >
> > Hi
> >
> > If you have anything else to put, please add.
> >
> > Migration Thread
> > * Plan is to integrate it as one of the first things in December (me)
> > * Remove copies with buffered file (me)
> >
> > Bitmap Optimization
> > * Finish moving to individual bitmaps for migration/vga/code
> > * Make sure we don't copy things around
> > * Shared memory bitmap with kvm?
> > * Move to 2MB pages bitmap and then fine grain?
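
One way to read "2MB pages bitmap and then fine grain" is a two-level dirty bitmap: a coarse bit per 2MB region, plus a fine per-page bitmap that is only consulted for regions whose coarse bit is set. The Python sketch below illustrates that reading only. TwoLevelDirtyBitmap is a hypothetical name, a 4KB base page size is assumed, and Python integers are used as bitmaps for brevity.

```python
PAGE_SIZE = 4096                               # assumed base page size
REGION_SIZE = 2 * 1024 * 1024                  # 2MB coarse granularity
PAGES_PER_REGION = REGION_SIZE // PAGE_SIZE    # 512 pages per region


class TwoLevelDirtyBitmap:
    """Coarse bitmap of 2MB regions plus a fine bitmap per dirty region."""

    def __init__(self):
        self.coarse = 0    # bit r set: region r contains a dirty page
        self.fine = {}     # region index -> int used as a 512-bit bitmap

    def mark_dirty(self, page):
        region, offset = divmod(page, PAGES_PER_REGION)
        self.coarse |= 1 << region
        self.fine[region] = self.fine.get(region, 0) | (1 << offset)

    def dirty_regions(self):
        """Scan only the coarse bitmap."""
        return [r for r in range(self.coarse.bit_length())
                if (self.coarse >> r) & 1]

    def dirty_pages(self, region):
        """Scan the fine bitmap of a single region."""
        bits = self.fine.get(region, 0)
        base = region * PAGES_PER_REGION
        return [base + o for o in range(bits.bit_length())
                if (bits >> o) & 1]
```

With mostly clean memory, a scan touches the small coarse bitmap first and descends only into the regions that are actually dirty.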
> >
> > QIDL
> > * Review the patches (me)
> >
> > PostCopy
> > * Review patches?
> > * See what we can already integrate?
> > I remember from last year that we could integrate the 1st third or so
> >
> > RDMA
> > * Send RDMA/tcp/.... library they already have (Benoit)
> > * This is required for postcopy
> > * This can be used for precopy
> >
> > General
> > * Change protocol to:
> > a) always being 16-byte aligned (paolo said that is faster)
> > b) do scatter/gather of the pages?
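
For the 16-byte alignment point, the padding arithmetic is simple enough to sketch in Python. The record layout here is purely hypothetical (the thread does not define the new protocol format); the sketch only shows how each record could be padded so that the next one starts on a 16-byte boundary.

```python
ALIGN = 16


def padding_for(length, align=ALIGN):
    """Bytes needed to round length up to the next multiple of align."""
    return -length % align


def pack_record(payload, align=ALIGN):
    """Append zero padding so the following record starts aligned."""
    return payload + b"\0" * padding_for(len(payload), align)


def record_offsets(payloads, align=ALIGN):
    """Stream offsets at which each padded record would start."""
    offsets, position = [], 0
    for payload in payloads:
        offsets.append(position)
        position += len(pack_record(payload, align))
    return offsets
```

For example, payloads of 3, 16 and 17 bytes would start at offsets 0, 16 and 32 in the stream.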
> >
> > Fault Tolerance
> > * That is built on top of migration code, but I have nothing to add.
> >
> > Any more ideas?
> >
> > Later, Juan.
>
--
yamahata