qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [RFC v2 0/8] monitor: allow per-monitor thread


From: Dr. David Alan Gilbert
Subject: Re: [Qemu-devel] [RFC v2 0/8] monitor: allow per-monitor thread
Date: Thu, 7 Sep 2017 10:27:29 +0100
User-agent: Mutt/1.8.3 (2017-05-23)

* Daniel P. Berrange (address@hidden) wrote:
> On Thu, Sep 07, 2017 at 10:19:47AM +0100, Dr. David Alan Gilbert wrote:
> > * Daniel P. Berrange (address@hidden) wrote:
> > > On Thu, Sep 07, 2017 at 04:13:41PM +0800, Peter Xu wrote:
> > > > On Wed, Sep 06, 2017 at 12:54:28PM +0100, Daniel P. Berrange wrote:
> > > > > On Wed, Sep 06, 2017 at 12:31:58PM +0100, Dr. David Alan Gilbert 
> > > > > wrote:
> > > > > > * Daniel P. Berrange (address@hidden) wrote:
> > > > > > > This does imply that you need a separate monitor I/O processing, 
> > > > > > > from the
> > > > > > > command execution thread, but I see no need for all commands to 
> > > > > > > suddenly
> > > > > > > become async. Just allowing interleaved replies is sufficient 
> > > > > > > from the
> > > > > > > POV of the protocol definition. This interleaving is easy to 
> > > > > > > handle from
> > > > > > > the client POV - just requires a unique 'serial' in the request 
> > > > > > > by the
> > > > > > > client, that is copied into the reply by QEMU.
> > > > > > 
> > > > > > OK, so for that we can just take Marc-André's syntax and call it 
> > > > > > 'id':
> > > > > >   
> > > > > > https://lists.gnu.org/archive/html/qemu-devel/2017-01/msg03634.html
> > > > > > 
> > > > > > then it's upto the caller to ensure those id's are unique.
> > > > > 
> > > > > Libvirt has in fact generated a unique 'id' for every monitor command
> > > > > since day 1 of supporting QMP.
> > > > > 
> > > > > > I do worry about two things:
> > > > > >   a) With this the caller doesn't really know which commands could 
> > > > > > be
> > > > > >   in parallel - for example if we've got a recovery command that's
> > > > > >   executed by this non-locking thread that's OK, we expect that
> > > > > >   to be doable in parallel.  If in the future though we do
> > > > > >   what you initially suggested and have a bunch of commands get
> > > > > >   routed to the migration thread (say) then those would suddenly
> > > > > >   operate in parallel with other commands that we're previously
> > > > > >   synchronous.
> > > > > 
> > > > > We could still have an opt-in for async commands. eg default to 
> > > > > executing
> > > > > all commands in the main thread, unless the client issues an explicit
> > > > > "make it async" command, to switch to allowing the migration thread to
> > > > > process it async.
> > > > > 
> > > > >  { "execute": "qmp_allow_async",
> > > > >    "data": { "commands": [
> > > > >        "migrate_cancel",
> > > > >    ] } }
> > > > > 
> > > > > 
> > > > >  { "return": { "commands": [
> > > > >        "migrate_cancel",
> > > > >    ] } }
> > > > > 
> > > > > The server response contains the subset of commands from the request
> > > > > for which async is supported.
> > > > > 
> > > > > That gives good negotiation ability going forward as we incrementally
> > > > > support async on more commands.
> > > > 
> > > > I think this goes back to the discussion on which design we'd like to
> > > > choose.  IMHO the whole async idea plus the per-command-id is indeed
> > > > cleaner and nicer, and I believe that can benefit not only libvirt,
> > > > but also other QMP users.  The problem is, I have no idea how long
> > > > it'll take to let us have such a feature - I believe that will include
> > > > QEMU and Libvirt to both support that.  And it'll be a pity if the
> > > > postcopy recovery cannot work only because we cannot guarantee a
> > > > stable monitor.
> > > 
> > > This is not a blocker for having postcopy recovery feature merged.
> > > It merely means that in a situation where the mainloop is blocked,
> > > then we can't recover, in other situations we'll be able to recover
> > > fine. Sure it would be nice to fix that problem too, but I don't
> > > see it as a block.
> > 
> > It's probably OK to merge the recovery code before the monitor code;
> > but I don't think it's something you'd want to tell users about -
> > a 'postcopy recovery that only works rarely' isn't much use.
> 
> I dunno. Compared to today where there's zero post-copy recovery,
> I think even an incremental improvement is useful. Its a choice
> between "your VM is dead" and "you've a 50/50 chance of life".

There's a chunk of people who wont use postcopy because they regard
it as dangerous; they need something that works in most cases before
they'll use it.

Dave

> Regards,
> Daniel
> -- 
> |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
--
Dr. David Alan Gilbert / address@hidden / Manchester, UK



reply via email to

[Prev in Thread] Current Thread [Next in Thread]