qemu-devel

From: Daniel P . Berrangé
Subject: Re: Time to introduce a migration protocol negotiation (Re: [PATCH v2 00/25] migration: Postcopy Preemption)
Date: Tue, 15 Mar 2022 11:05:51 +0000
User-agent: Mutt/2.1.5 (2021-12-30)

On Tue, Mar 15, 2022 at 10:43:45AM +0000, Dr. David Alan Gilbert wrote:
> * Daniel P. Berrangé (berrange@redhat.com) wrote:
> > Almost every time we add a new feature to migration, we end up
> > having to define at least one new migration parameter, then wire
> > it up in libvirt, and then the mgmt app too, often needing to
> > ensure it is turned on for both client and server at the same time.
> > 
> > 
> > For some features, requiring an explicit opt-in could make sense,
> > because we don't know for sure that the feature is always a benefit.
> > These are things that can be thought of as workload sensitive
> > tunables.
> > 
> > 
> > For other features though, it feels like we would be better off if
> > we could turn it on by default with no config. These are things
> > that can be thought of as migration infrastructure / transport
> > architectural designs.
> > 
> > 
> > eg it would be nice to be able to use multifd by default for
> > migration. We would still want a tunable to control the number
> > of channels, but we ought to be able to just start with a default
> > number of channels automatically, so the tunable is only needed
> > for special cases.
> 
> Right, I agree in part - but we do need those tunables to exist; we rely
> on being able to turn things on or off, or play with the tunables
> to debug and get performance.  We need libvirt to enumerate the tunables
> from qemu rather than having to add code to libvirt every time.
> They're all in QAPI definitions anyway - libvirt really shouldn't be
> adding code each time.   Then we could have a  virsh migrate --tunable
> rather than having loads of extra options which all have different names
> from qemu's name for the same feature.

Provided tunables are strictly just tunables, that would be viable.
Right now our tunables are a mixture of tunables and low level
data transport architectural knobs.
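
As a rough sketch of the generic pass-through idea, assuming the mgmt
layer already has the reply from QMP 'query-migrate-parameters' as a
dict: the helper names, the "on"/"off" spelling for booleans and the
trick of inferring each type from the current value are all
assumptions for illustration (a real implementation would more likely
consult the QAPI schema). It only covers things that are genuinely
tunables.

```python
import json


def parse_tunables(args, current):
    """Turn ['name=value', ...] into arguments for migrate-set-parameters.

    'current' is assumed to be the dict returned by query-migrate-parameters;
    the type of each current value decides how the string is converted.
    """
    out = {}
    for arg in args:
        name, sep, raw = arg.partition("=")
        if not sep:
            raise ValueError(f"expected name=value, got {arg!r}")
        if name not in current:
            raise KeyError(f"QEMU did not report a tunable named {name!r}")
        kind = type(current[name])
        if kind is bool:
            # Assumed spelling for booleans on the command line.
            if raw not in ("on", "off"):
                raise ValueError(f"{name} expects on/off, got {raw!r}")
            out[name] = raw == "on"
        elif kind is int:
            out[name] = int(raw)
        else:
            out[name] = raw
    return out


def build_command(args, current):
    """Build the QMP command text without any per-tunable code."""
    return json.dumps({
        "execute": "migrate-set-parameters",
        "arguments": parse_tunables(args, current),
    })
```

The point is that nothing in here names an individual tunable, so a
new one in QEMU needs no new code in the layer above.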

> > This post-copy is another case.  We should start off knowing
> > we can switch to post-copy at any time. We should further be
> > able to add pre-emption if we find it available. IOW, we should
> > not have required anything more than 'switch to post-copy' to
> > be exposed to mgmt apps.
> 
> Some of these things are tricky; for example knowing whether or not you
> can do postcopy depends on your exact memory configuration; some of that
> is tricky to probe.

I'm just referring to the postcopy capability that we need to
set upfront before starting the migration on both sides.  IIUC
that should be possible for QEMU to automatically figure out,
if it could negotiate with dst QEMU.

Whether we ever switch from precopy to postcopy mode once
running can remain under mgmt app control.
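
To illustrate what "automatically figure out" could mean, here is a
minimal sketch of the selection step, assuming each side simply
advertises a list of protocol versions and feature names. The function
name and the feature strings are made up for the example; nothing like
this exists in QEMU today.

```python
def negotiate(dst_versions, dst_features, src_versions, src_features):
    """Pick the highest common version and the features both sides support.

    The destination's ordering of features is preserved in the result.
    """
    common = set(dst_versions) & set(src_versions)
    if not common:
        raise RuntimeError("no common migration protocol version")
    src_set = set(src_features)
    selected = [name for name in dst_features if name in src_set]
    return max(common), selected
```

With this shape, a capability such as postcopy ends up enabled only
when both QEMUs offer it, and the mgmt app never has to set it on
either side.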

> > Or enabling zero copy on either send or receive side.
> > 
> > Or enabling kernel-TLS offload
> 
> Will kernel-TLS be something you'd want to automatically turn on?
> We don't know yet whether it's a good idea if you don't have hardware
> support.

I'm pretty sure kTLS will always be a benefit, because even without
hardware offload you still benefit from getting the TLS encryption
onto a separate CPU core from QEMU's migration thread. We've measured
this already with NBD and I've no reason to suspect it will differ
for migration. 


> > Now define a protocol handshake. A 5 minute thought experiment
> > starts off with something simple:
> > 
> >    dst -> src:  Greeting Message:
> >                   Magic: "QEMU-MIGRATE"  12 bytes
> >                   Num Versions: 1 byte
> >                   Version list: 1 byte * num versions
> >                   Num features: 4 bytes
> >                   Feature list: string * num features
> > 
> >    src -> dst:  Greeting Reply:
> >                   Magic: "QEMU-MIGRATE" 12 bytes
> >                   Select version: 1 byte
> >                   Num select features: 4 bytes
> >                   Selected features: string * num features   
> > 
> >    .... possibly more src <-> dst messages depending on
> >         features negotiated....
> > 
> >    src -> dst:  start migration
> >  
> >     ...traditional migration stream runs now for the remainder
> >        of this connection ...
> 
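
For what it's worth, the thought experiment above is small enough to
sketch in a few lines. This assumes big-endian integers and a 2-byte
length prefix in front of each feature string, neither of which the
outline above specifies; the function names are likewise only
illustrative.

```python
import struct

MAGIC = b"QEMU-MIGRATE"  # 12 bytes, as in the outline above


def _pack_features(features):
    # 4-byte count, then each name as <2-byte length><UTF-8 bytes> (assumed).
    out = struct.pack(">I", len(features))
    for name in features:
        raw = name.encode("utf-8")
        out += struct.pack(">H", len(raw)) + raw
    return out


def _unpack_features(buf, pos):
    (count,) = struct.unpack_from(">I", buf, pos)
    pos += 4
    features = []
    for _ in range(count):
        (size,) = struct.unpack_from(">H", buf, pos)
        pos += 2
        features.append(buf[pos:pos + size].decode("utf-8"))
        pos += size
    return features


def encode_greeting(versions, features):
    """dst -> src: magic, version count, version list, feature list."""
    return MAGIC + bytes([len(versions)]) + bytes(versions) + _pack_features(features)


def decode_greeting(buf):
    if buf[:len(MAGIC)] != MAGIC:
        raise ValueError("bad magic")
    count = buf[12]
    versions = list(buf[13:13 + count])
    return versions, _unpack_features(buf, 13 + count)


def encode_reply(version, features):
    """src -> dst: magic, selected version, selected feature list."""
    return MAGIC + bytes([version]) + _pack_features(features)


def decode_reply(buf):
    if buf[:len(MAGIC)] != MAGIC:
        raise ValueError("bad magic")
    return buf[12], _unpack_features(buf, 13)
```
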
> Don't worry about designing the bytes; we already have a command
> structure; we just need to add a MIG_CMD_FEATURES and a MIG_RP_MSG_FEATURES
> (I'm not sure what we need to do for RDMA; or what we do for exec: or
> savevm)

For RDMA there are two options:

 - Drop RDMA support (preferred ;-)

 - Use a regular TCP channel for the migration protocol
   handshake to do all the feature negotiation.  Open a
   second channel using RDMA just for the migration payload

Before considering "exec", let's think about "fd" as that's more
critical.

How can we get an arbitrary number of bi-directional channels
open when the user is passing in pre-opened FDs individually and
does not know upfront how many QEMU wants?

We could have an event that QEMU emits whenever it wants to be
given a new "fd" channel. The mgmt app would watch for that and
pass in more pre-opened FDs in response. Not too difficult.
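
A minimal simulation of that flow, assuming a UNIX socket as the
control channel and an invented event name (QEMU has no such event
today). The FD is handed over with SCM_RIGHTS via Python's
socket.send_fds() / socket.recv_fds(), which need Python 3.9 or newer;
all three function names are only for illustration.

```python
import json
import os
import socket

EVENT_NAME = "MIGRATION_CHANNEL_NEEDED"  # hypothetical event name


def qemu_request_channel(ctrl):
    """QEMU side (simulated): ask the mgmt app for one more channel."""
    ctrl.sendall(json.dumps({"event": EVENT_NAME}).encode() + b"\n")


def mgmt_handle_event(ctrl, open_channel):
    """Mgmt app side: read one event and answer with a pre-opened FD.

    'open_channel' is any callable returning a connected file descriptor.
    """
    event = json.loads(ctrl.recv(4096))
    if event.get("event") != EVENT_NAME:
        return False
    fd = open_channel()
    try:
        socket.send_fds(ctrl, [b"fd"], [fd])
    finally:
        os.close(fd)  # the receiver holds its own duplicate
    return True


def qemu_receive_channel(ctrl):
    """QEMU side (simulated): collect the FD and wrap it as a socket."""
    _data, fds, _flags, _addr = socket.recv_fds(ctrl, 16, 1)
    return socket.socket(fileno=fds[0])
```

QEMU can repeat the request as often as it likes, so the mgmt app
never needs to know the channel count upfront.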

Back to "exec", we have two options:

 - Drop exec support, and just let the user spawn the
   program externally and pass in pre-opened socket
   FDs for talking to it

 - Keep exec and make it use a socketpair instead of
   pipe FDs. Connect the socketpair to both stdin+stdout.
   Exec the program many times if needing many channels.
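
The second option is easy to demonstrate outside QEMU. A minimal
sketch, assuming a POSIX host: spawn_channel() and the toy ECHO
command are made up for the example, with the child just upper-casing
whatever it reads so the round trip is visible.

```python
import socket
import subprocess
import sys

# Toy stand-in for the user's program: read stdin once, write it back
# upper-cased on stdout. Both are the same socket in the child.
ECHO = [sys.executable, "-c", "import os; os.write(1, os.read(0, 64).upper())"]


def spawn_channel(argv):
    """Exec argv with one end of a socketpair as both stdin and stdout.

    Returns the process and the parent's end, which is bi-directional,
    unlike a pair of pipe FDs.
    """
    parent, child = socket.socketpair()
    proc = subprocess.Popen(argv, stdin=child, stdout=child)
    child.close()  # the child process keeps its own copies as fd 0 and 1
    return proc, parent


def spawn_channels(argv, count):
    """Exec the program once per channel when many channels are needed."""
    return [spawn_channel(argv) for _ in range(count)]
```

Each call gives one bi-directional channel, so needing N channels
means N copies of the program.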


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|