[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: Adding a handshake to qemu-guest-agent
From: |
Michael Roth |
Subject: |
Re: Adding a handshake to qemu-guest-agent |
Date: |
Wed, 16 Feb 2022 14:51:44 -0600 |
On Wed, Feb 16, 2022 at 10:12:36AM +0100, Markus Armbruster wrote:
> Michael Roth <michael.roth@amd.com> writes:
>
> > On Mon, Feb 14, 2022 at 03:14:37PM +0100, Markus Armbruster wrote:
> >> Cc: the qemu-ga maintainer
> >>
> >> John Snow <jsnow@redhat.com> writes:
> >>
> >> > [Moving our discussion upstream, because it stopped being brief and
> >> > simple.]
> >
> > Hi John, Markus,
> >
> >>
> >> Motivation: qemu-ga doesn't do capability negotiation as specified in
> >> docs/interop/qmp-spec.txt.
> >>
> >> Reminder: qmp-spec.txt specifies the server shall send a greeting
> >> containing the capabilities on offer. The client shall send a
> >> qmp_capabilities command before any other command.
> >>
> >> We can't just fix qemu-ga to comply, because it would break existing
> >> clients.
> >>
> >> We could document its behavior in qmp-spec.txt. Easy enough, but also
> >> kind of sad.
> >
> > I'm not sure we could've ever done it QMP-style with the initial
> > greeting/negotiation mode. It's been a while, I but recall virtio-serial
> > chardev in guest not having a very straight-forward way of flushing out
> > data from the vring after a new client connects on the host side, so
> > new clients had a chance of reading left-over garbage from previous
> > client sessions. Or maybe it was open/close/open on the guest/chardev
> > side that didn't cause the flush... anyway:
> >
> > This is why guest-sync was there, so you could verify the stream was
> > in sync with a given "session ID" before continuing. But that doesn't
> > help much if the stream is in some garbage state that parser can't
> > recover from...
> >
> > This is why guest-sync-delimited was introduced; it inserts a 0xFF
> > sential value (invalid for any normal QMP stream) prior to response that
> > a client can scan for to flush the stream. Similarly, clients are
> > supposed to precede guest-sync/guest-sync-delimited so QGA to get stuck
> > trying to parse a partial read from an earlier client that is 'eating' a
> > new request from a new client connection. I don't think these are really
> > issues with vsock (or the other transports QGA accepts), but AFAIK
> > Windows is still mostly reliant on virtio-serial, so these are probably
> > still needed.
>
> I believe you're right about the reason being virtio-serial. I
> documented it that way in commit 72e9e569d0 "docs/interop/qmp-spec: How
> to force known good parser state".
>
> 2.6 Forcing the JSON parser into known-good state
> -------------------------------------------------
>
> Incomplete or invalid input can leave the server's JSON parser in a
> state where it can't parse additional commands. To get it back into
> known-good state, the client should provoke a lexical error.
>
> The cleanest way to do that is sending an ASCII control character
> other than '\t' (horizontal tab), '\r' (carriage return), or '\n' (new
> line).
>
> Sadly, older versions of QEMU can fail to flag this as an error. If a
> client needs to deal with them, it should send a 0xFF byte.
>
> 2.7 QGA Synchronization
> -----------------------
>
> When a client connects to QGA over a transport lacking proper
> connection semantics such as virtio-serial, QGA may have read partial
> input from a previous client. The client needs to force QGA's parser
> into known-good state using the previous section's technique.
> Moreover, the client may receive output a previous client didn't read.
> To help with skipping that output, QGA provides the
> 'guest-sync-delimited' command. Refer to its documentation for
> details.
>
> 0xFF is invalid UTF-8, which is kind of icky. We should've used a
> proper control character like EOT (end of transmission) from the start.
> Water under the bridge.
>
> guest-sync has another design flaw: an unread command reply consisting
> of just an integer can be confused with guest-sync's reply. Unlikely as
> long as guest-sync's @id argument is chosen at random, as its
> documentation demands.
>
> guest-sync could be deprecated, I guess.
Yes, should probably be deprecated in favor of guest-sync-delimited. I
left it for clients that really don't want to dig into the transport
layer to search for 0xFF, but still want at least some ability to
re-sync.
>
> The @id argument of guest-sync and guest-sync-delimited feels kind of
> redundant with the command object's @id member. Except QGA didn't
> conform to the QMP spec until commit 4eaca8de26 "qmp: common 'id'
> handling & make QGA conform to QMP spec" (v4.0.0). More water under the
> bridge.
>
> Note that there's no need for all this when the transport provides
> proper connection semantics. Clients relying on connection semantics
> work fine even when they don't bother with this syncing stuff. Do such
> clients exist? We probably don't know. May or may not matter.
True, I think only virtio-serial and maybe isa-serial require the sync.
I was hoping virtio-vsock might quickly become the default, then most
clients would no longer need guest-sync*, but Windows still seems to be
reliant on virtio-serial (AFAIK).
>
> > So QGA has sort of always had its own hand-shake, ideally via
> > guest-sync-delimited. So if this new negotiation mechanism could
> > build off of that, rather than introducing something on top, that would
> > be ideal. Unfortunately it's naming isn't great for what's being done
> > here, but 'synchronize' is sorta in the ball-park at least...
>
> Fair point.
>
> >> Is there a way to add capability negotiation to qemu-ga without breaking
> >> existing clients? We obviously have to make it optional.
> >>
> >> The obvious idea "make qmp_capabilities optional" doesn't work, because
> >> the client needs to receive the greeting before sending
> >> qmp_capabilities, to learn what capabilities are on offer.
> >>
> >> This leads to...
> >>
> >> > What about something like this:
> >> >
> >> > Add a new "request-negotiation" command to qemu-guest-agent 7.0.0.
> >> >
> >> > [Modern client to unknown server]
> >> > 1. A modern client connects to a server of unknown version, and
> >> > without waiting, issues the "request-negotiation" command.
> >> > 2. An old server will reply with CommandNotFound. We are done
> >> > negotiating.
> >> > 3. A modern server will reply with the greeting in the traditional
> >> > format, but as a reply object (to preserve "execute" semantics.)
> >> > 4. The modern client will now issue qmp-capabilities as normal.
> >> > 5. The server replies with success or failure as normal.
> >> > 6. Connection is fully established.
> >> >
> >> > [Old client to unknown server]
> >> > 1. An old client connects to an unknown version server.
> >> > 2. A command is issued some time later.
> >> > 2a. The server is old, the command worked as anticipated.
> >> > 2b. The server is new, the command fails with CommandNotFound and
> >> > urges the use of 'request-negotiation'.
> >>
> >> A new server could accept the command, too. This way, negotiation
> >> remains optional, unlike in "normal" QMP. Old clients don't negotiate,
> >> and get default capabilities.
> >
> > ...so what if we had guest-sync/guest-sync-delimited start returning
> > capabilities and version fields rather than a new request-negotiation
> > command? If they aren't present we know the server is too old, and don't
> > have to rely on CommandNotFound to determine that.
>
> Both guest-sync and guest-sync-delimited have a design flaw that defeats
> such a straighforward extension: 'returns': 'int'. There is no way to
> return more data compatibly.
Ugh, I misread the scheme and thought it was a struct with a single
field... I knew that seemed to good to be true.
>
> We could add an optional flag to guest-sync-delimited to make it return
> an object. But I think we'd be better off with a new command.
New cmd is probably for the best then, since hopefully guest-sync-delimited
can become irrelevant in the future, and the possibility of a failed sync
command on old clients (due to new param) getting mixed in with the recovery
logic for a failed negotiation/capability probe, would probably make the
whole interface a bit too unwieldy.
> >> > - QGA is now officially on a different flavor of QMP protocol. You
> >> > still need to know in advance if you are connecting to QGA or anything
> >> > else. That's still a little sad, but maybe that's just simply an
> >> > impossible goal.
> >> >
> >> > Bonus:
> >> > - If an execution ID is used when sending "request-negotiation", we
> >> > know that the server is at least version 4.0.0 if it responds to us
> >> > using that ID. A modern client can then easily distinguish between
> >> > pre-4.0, post-4.0 and post-7.0 servers. It's a useful probe.
> >
> > Is that in reference to the optional 'id' field that can be passed to
> > QMP requests? Don't see the harm in that, and QGA should pass them back
> > intact.
>
> I think it does since commit 4eaca8de26 "qmp: common 'id' handling &
> make QGA conform to QMP spec" (v4.0.0).
Ah, right, sort of remember now. Thanks for the pointer.
-Mike