qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: QMP and the 'id' parameter


From: John Snow
Subject: Re: QMP and the 'id' parameter
Date: Fri, 20 Nov 2020 11:49:40 -0500
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.3.1

On 11/20/20 5:25 AM, Markus Armbruster wrote:
John Snow <jsnow@redhat.com> writes:
On 11/11/20 3:27 AM, Markus Armbruster wrote:
John Snow <jsnow@redhat.com> writes:
On 11/10/20 1:22 AM, Markus Armbruster wrote:
John Snow <jsnow@redhat.com> writes:


If you find yourself in a situation where an error class would
definitely be useful, we should talk about providing you one.


OK, thanks. I just wanted to check there wasn't a stronger opposition than I was aware of. I only vaguely knew they were "on the way out."

QGA has a special case, but you probably don't want to know.

I do want to know, unfortunately. A good QMP client should likely
support both targets.

QGA has a few commands that send *nothing* on success.

qapi-code-gen.txt explains:

     Normally, the QAPI schema is used to describe synchronous exchanges,
     where a response is expected.  But in some cases, the action of a
     command is expected to change state in a way that a successful
     response is not possible (although the command will still return an
     error object on failure).  When a successful reply is not possible,
     the command definition includes the optional member 'success-response'
     with boolean value false.  So far, only QGA makes use of this member.

qmp-spec.txt neglects to cover this special case.


Oh, I see. That's fun!

(Was it not possible to have the client send an ACK response that doesn't indicate success, but just signals receipt? Semantically, it would be the difference between do-x and start-x as a command.)

Even though counting responses is a workable way to map responses back
to requests, I recommend to either have at most one request in flight
(so no counting is necessary), or to add IDs to all requests (so
counting isn't necessary as long as you get the syntax right enough to
avoid the ID-less errors discussed above).

I am taking this approach. I still need some error handling for the
case that an incoming response cannot be routed to the correct pending
queue.

The QMP spec states that if I get an ID and I don't recognize it, I am
free to drop the response on the floor (So I have done so),

Should not happen.  When it does, something is wrong, possibly seriously
wrong.


Yep, I agree.

(In my client library: I have decided to enforce string IDs and not allow the caller to specify their own. Though we allow arbitrary objects to be used as an ID, being able to compare more complex objects seems more prone to failure than a simple string.)

One possible scenario: ID got corrupted on the way from client via
server back to the client.  The response with the uncorrupted ID will
never come.  Other responses and events may still arrive fine.

Two recovery strategies:

1. Drop the unexpected response, carry on.  If commands are in flight,
    be prepared for missing responses.

2. Give up, close connection.

Pick the strategy that best fits your client's needs.


Dropping it on the floor seems fine for now, the spec says I can! :)

The most realistic scenario for this outside of catastrophic error is that a client sends a command, but then times out waiting for a response. Later, the response arrives -- but that context has already come and gone, so there's nowhere to route the message to.

Effectively, an orphaned reply. I suppose I could always stash them in an orphaned queue and a very, very rigorous client could check the orphaned queue if something winds up in there, but I suspect most clients would actually be just as happy to let it fall on the floor.

                                                             but I need
to handle the case that I see a response with no ID at all.

Should not happen as long as the client sends syntactically sane input.
When it does, something is wrong, possibly seriously wrong.

Perhaps the easiest, and most sensible thing to do, is to declare this
a catastrophic failure and move the QMP connection into an error
state, invalidate the whole queue, and disconnect.

(And then maybe some of the commands you issued got processed, maybe
they didn't. I guess it's up to the application driving the library to
query around to find out what happened.)

Again, two recovery strategies:

1. Drop the unexpected response, carry on.  If commands are in flight,
    be prepared for missing responses.  The server's JSON parser may now
    be in a state where it can't parse additional commands.  You may want
    to provoke a lexical error to force it into known-good state, as
    explained below.

2. Give up, close connection.


2 is easier for now. 1 involves some more advanced recovery logic, but as you say: something is wrong, possibly seriously wrong.

I see. So OOB commands don't execute in parallel, and when they "jump
the queue", further executions are not received. Right?

Right.  It's "out of band", not "in parallel".


OK. Naively I thought that new OOB commands would continue to be processed "out of band", but it's fine if that capacity is finite.

Conceptually, then: There's one input queue and two pending execution
slots. One slot is for OOB commands, and the other slot is for
traditional commands.

The input queue is FIFO, and when we pull a request from the queue, it
occupies one of the conceptual slots (normal/oob). If the slot for
this command type is already occupied, we block until it's free.

Is that the correct mental model? (Even if it's not really even
vaguely correct, details-wise.)

Almost.

Consider

     - Send execute A
     - Send execute B
     - Send exec-oob C
     - Send execute D


[...]

The I/O thread receives in-band A, and puts it into the empty in-band
queue.

The I/O thread receives in-band B, and puts it into the non-full in-band
queue.

The I/O thread receives out-of-band C, and exectutes it right away.


Ah, I see where I went wrong. OK, it's crystal clear now.

The I/O thread receives in-band C, and puts it into the non-full in-band
queue.

Meanwhile, the main thread takes in-band A out of the queue, and starts
executing it.  A takes its own sweet time.

Only if the client has more in-band commands in flight than the in-band
queue can hold, the I/O thread encounters a full queue and suspends
reading.

qmp-spec.txt:

     If the client sends in-band commands faster than the server can
     execute them, the server will stop reading the requests from the QMP
     channel until the request queue length is reduced to an acceptable
     range.

The queue length should probably be specified here.  It's 8.


Oh! That's very helpful to know, actually. If you write a patch to amend the language of the doc, I will give you my R-B.

The server limits the in-band request queue length for the same
robustness reasons it limits JSON token length, token count and nesting
depth: to not let a malfunctioning client make it consume unlimited
amounts of memory.  Hole (kind of): it doesn't limit output queue
length.


Understood!

Is there a reason to not always use \xFF ?

It's invalid UTF-8.

"Older versions" should probably read "ancient versions" by now.  I
don't remember offhand which commit fixed the JSON parser to do the
right thing in all states.


I suppose right now I don't have any code that tries to unjam the parser, so I suppose it doesn't matter.

If it did, I guess I would just try to assume that I was allowed to use the newer method and too-bad to folks trying to talk to ancient stuff.

Questions?
A few :)
More?
None of tremendous import, I think, by now. Maybe the GQA stuff, but
that can be later.
Still more?

Where does matter come from? If energy cannot be created or destroyed, how did it come to be?




reply via email to

[Prev in Thread] Current Thread [Next in Thread]