qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Adding a handshake to qemu-guest-agent


From: Michael Roth
Subject: Re: Adding a handshake to qemu-guest-agent
Date: Wed, 16 Feb 2022 14:51:44 -0600

On Wed, Feb 16, 2022 at 10:12:36AM +0100, Markus Armbruster wrote:
> Michael Roth <michael.roth@amd.com> writes:
> 
> > On Mon, Feb 14, 2022 at 03:14:37PM +0100, Markus Armbruster wrote:
> >> Cc: the qemu-ga maintainer
> >> 
> >> John Snow <jsnow@redhat.com> writes:
> >> 
> >> > [Moving our discussion upstream, because it stopped being brief and 
> >> > simple.]
> >
> > Hi John, Markus,
> >
> >> 
> >> Motivation: qemu-ga doesn't do capability negotiation as specified in
> >> docs/interop/qmp-spec.txt.
> >> 
> >> Reminder: qmp-spec.txt specifies the server shall send a greeting
> >> containing the capabilities on offer.  The client shall send a
> >> qmp_capabilities command before any other command.
> >> 
> >> We can't just fix qemu-ga to comply, because it would break existing
> >> clients.
> >> 
> >> We could document its behavior in qmp-spec.txt.  Easy enough, but also
> >> kind of sad.
> >
> > I'm not sure we could've ever done it QMP-style with the initial
> > greeting/negotiation mode. It's been a while, I but recall virtio-serial
> > chardev in guest not having a very straight-forward way of flushing out
> > data from the vring after a new client connects on the host side, so
> > new clients had a chance of reading left-over garbage from previous
> > client sessions. Or maybe it was open/close/open on the guest/chardev
> > side that didn't cause the flush... anyway:
> >
> > This is why guest-sync was there, so you could verify the stream was
> > in sync with a given "session ID" before continuing. But that doesn't
> > help much if the stream is in some garbage state that parser can't
> > recover from...
> >
> > This is why guest-sync-delimited was introduced; it inserts a 0xFF
> > sential value (invalid for any normal QMP stream) prior to response that
> > a client can scan for to flush the stream. Similarly, clients are
> > supposed to precede guest-sync/guest-sync-delimited so QGA to get stuck
> > trying to parse a partial read from an earlier client that is 'eating' a
> > new request from a new client connection. I don't think these are really
> > issues with vsock (or the other transports QGA accepts), but AFAIK
> > Windows is still mostly reliant on virtio-serial, so these are probably
> > still needed.
> 
> I believe you're right about the reason being virtio-serial.  I
> documented it that way in commit 72e9e569d0 "docs/interop/qmp-spec: How
> to force known good parser state".
> 
>     2.6 Forcing the JSON parser into known-good state
>     -------------------------------------------------
> 
>     Incomplete or invalid input can leave the server's JSON parser in a
>     state where it can't parse additional commands.  To get it back into
>     known-good state, the client should provoke a lexical error.
> 
>     The cleanest way to do that is sending an ASCII control character
>     other than '\t' (horizontal tab), '\r' (carriage return), or '\n' (new
>     line).
> 
>     Sadly, older versions of QEMU can fail to flag this as an error.  If a
>     client needs to deal with them, it should send a 0xFF byte.
> 
>     2.7 QGA Synchronization
>     -----------------------
> 
>     When a client connects to QGA over a transport lacking proper
>     connection semantics such as virtio-serial, QGA may have read partial
>     input from a previous client.  The client needs to force QGA's parser
>     into known-good state using the previous section's technique.
>     Moreover, the client may receive output a previous client didn't read.
>     To help with skipping that output, QGA provides the
>     'guest-sync-delimited' command.  Refer to its documentation for
>     details.
> 
> 0xFF is invalid UTF-8, which is kind of icky.  We should've used a
> proper control character like EOT (end of transmission) from the start.
> Water under the bridge.
> 
> guest-sync has another design flaw: an unread command reply consisting
> of just an integer can be confused with guest-sync's reply.  Unlikely as
> long as guest-sync's @id argument is chosen at random, as its
> documentation demands.
> 
> guest-sync could be deprecated, I guess.  

Yes, should probably be deprecated in favor of guest-sync-delimited. I
left it for clients that really don't want to dig into the transport
layer to search for 0xFF, but still want at least some ability to
re-sync.

> 
> The @id argument of guest-sync and guest-sync-delimited feels kind of
> redundant with the command object's @id member.  Except QGA didn't
> conform to the QMP spec until commit 4eaca8de26 "qmp: common 'id'
> handling & make QGA conform to QMP spec" (v4.0.0).  More water under the
> bridge.
> 
> Note that there's no need for all this when the transport provides
> proper connection semantics.  Clients relying on connection semantics
> work fine even when they don't bother with this syncing stuff.  Do such
> clients exist?  We probably don't know.  May or may not matter.

True, I think only virtio-serial and maybe isa-serial require the sync.
I was hoping virtio-vsock might quickly become the default, then most
clients would no longer need guest-sync*, but Windows still seems to be
reliant on virtio-serial (AFAIK).

> 
> > So QGA has sort of always had its own hand-shake, ideally via
> > guest-sync-delimited. So if this new negotiation mechanism could
> > build off of that, rather than introducing something on top, that would
> > be ideal. Unfortunately it's naming isn't great for what's being done
> > here, but 'synchronize' is sorta in the ball-park at least...
> 
> Fair point.
> 
> >> Is there a way to add capability negotiation to qemu-ga without breaking
> >> existing clients?  We obviously have to make it optional.
> >> 
> >> The obvious idea "make qmp_capabilities optional" doesn't work, because
> >> the client needs to receive the greeting before sending
> >> qmp_capabilities, to learn what capabilities are on offer.
> >> 
> >> This leads to...
> >> 
> >> > What about something like this:
> >> >
> >> > Add a new "request-negotiation" command to qemu-guest-agent 7.0.0.
> >> >
> >> > [Modern client to unknown server]
> >> > 1. A modern client connects to a server of unknown version, and
> >> > without waiting, issues the "request-negotiation" command.
> >> > 2. An old server will reply with CommandNotFound. We are done 
> >> > negotiating.
> >> > 3. A modern server will reply with the greeting in the traditional
> >> > format, but as a reply object (to preserve "execute" semantics.)
> >> > 4. The modern client will now issue qmp-capabilities as normal.
> >> > 5. The server replies with success or failure as normal.
> >> > 6. Connection is fully established.
> >> >
> >> > [Old client to unknown server]
> >> > 1. An old client connects to an unknown version server.
> >> > 2. A command is issued some time later.
> >> >   2a. The server is old, the command worked as anticipated.
> >> >   2b. The server is new, the command fails with CommandNotFound and
> >> > urges the use of 'request-negotiation'.
> >> 
> >> A new server could accept the command, too.  This way, negotiation
> >> remains optional, unlike in "normal" QMP.  Old clients don't negotiate,
> >> and get default capabilities.
> >
> > ...so what if we had guest-sync/guest-sync-delimited start returning
> > capabilities and version fields rather than a new request-negotiation
> > command? If they aren't present we know the server is too old, and don't
> > have to rely on CommandNotFound to determine that.
> 
> Both guest-sync and guest-sync-delimited have a design flaw that defeats
> such a straighforward extension: 'returns': 'int'.  There is no way to
> return more data compatibly.

Ugh, I misread the scheme and thought it was a struct with a single
field... I knew that seemed to good to be true.

> 
> We could add an optional flag to guest-sync-delimited to make it return
> an object.  But I think we'd be better off with a new command.

New cmd is probably for the best then, since hopefully guest-sync-delimited
can become irrelevant in the future, and the possibility of a failed sync
command on old clients (due to new param) getting mixed in with the recovery
logic for a failed negotiation/capability probe, would probably make the
whole interface a bit too unwieldy.

> >> > - QGA is now officially on a different flavor of QMP protocol. You
> >> > still need to know in advance if you are connecting to QGA or anything
> >> > else. That's still a little sad, but maybe that's just simply an
> >> > impossible goal.
> >> >
> >> > Bonus:
> >> > - If an execution ID is used when sending "request-negotiation", we
> >> > know that the server is at least version 4.0.0 if it responds to us
> >> > using that ID. A modern client can then easily distinguish between
> >> > pre-4.0, post-4.0 and post-7.0 servers. It's a useful probe.
> >
> > Is that in reference to the optional 'id' field that can be passed to
> > QMP requests? Don't see the harm in that, and QGA should pass them back
> > intact.
> 
> I think it does since commit 4eaca8de26 "qmp: common 'id' handling &
> make QGA conform to QMP spec" (v4.0.0).

Ah, right, sort of remember now. Thanks for the pointer.

-Mike



reply via email to

[Prev in Thread] Current Thread [Next in Thread]