Re: QMP and the 'id' parameter

qemu-devel

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: QMP and the 'id' parameter

From:	Markus Armbruster
Subject:	Re: QMP and the 'id' parameter
Date:	Wed, 11 Nov 2020 09:27:33 +0100
User-agent:	Gnus/5.13 (Gnus v5.13) Emacs/27.1 (gnu/linux)

John Snow <jsnow@redhat.com> writes:

> On 11/10/20 1:22 AM, Markus Armbruster wrote:
>> John Snow <jsnow@redhat.com> writes:
>> 
>>> The QMP specification states:
>>>
>>>> NOTE: Some errors can occur before the Server is able to read the "id"
>>>> member, in these cases the "id" member will not be part of the error
>>>> response, even if provided by the client.
>>>
>>> I am assuming this case ONLY occurs for Parse errors:
>>>
>>> {'class': 'GenericError', 'desc': 'JSON parse error, expecting value'}
>> 
>> There are more "desc" possible, actually.
>> 
>> The JSON parser gets fed chunks of input, and calls a callback for every
>> full JSON value, and on parse error.
>> 
>> QMP's callback is handle_qmp_command().  Parameter @req is the parsed
>> JSON value, parameter @err is the (parse) error object, and exactly one
>> of them is non-null.
>> 
>> 1. Parse error
>> 
>> If @err, we send an error response for it.  It never has "id".  See
>> qmp_error_response() caller monitor_qmp_dispatcher_co().  The possible
>> @err are:
>> 
>>      $ grep error_setg qobject/json-*[ch]
>>      qobject/json-parser.c:    error_setg(&ctxt->err, "JSON parse error, 
>> %s", message);
>> 
>> This is a syntax error.
>> 
>> Search for parse_error() to see the possible @message patterns.
>> 
>>      qobject/json-streamer.c:        error_setg(&err, "JSON parse error, 
>> stray '%s'", input->str);
>> 
>> This is a lexical error.
>> 
>>      qobject/json-streamer.c:        error_setg(&err, "JSON token size limit 
>> exceeded");
>>      qobject/json-streamer.c:        error_setg(&err, "JSON token count 
>> limit exceeded");
>>      qobject/json-streamer.c:        error_setg(&err, "JSON nesting depth 
>> limit exceeded");
>> 
>> These are (intentional) parser limits.
>> 
>> 2. Successful parse
>> 
>> If @req, it's a successful parse.
>> 
>> If @req is not a JSON object, there is no "id".  qmp_dispatch() reports
>> 
>>          error_setg(&err, "QMP input must be a JSON object");
>> 
>> If @req is a JSON object, it has "id" exactly when the client supplied
>> one.  The response mirrors @req's "id".  See qmp_error_response() caller
>> qmp_dispatch().
>> 
>
> That's very helpful, thank you. So in summary, the possibilities are:
>
> 1. syntax error (with several descriptions)
> 2. lexical error (with several descriptions, templated from one message)
> 3. lexical limitation (with several descriptions)

Nitpick: parser limitation.  Nesting depth is not lexical :)

> 4. grammatical error ("QMP input must be a JSON object")

We got two layers of syntax: QAPI schema above JSON.  The JSON layer
builds JSON from characters, the QAPI schema layer builds QAPI schema
from JSON.

Errors 1-3 are for the JSON layer.

Error 4 is a QAPI schema syntax error.  The qobject input visitor
reports more of them.  Error 4 is special only because it's the only one
where the JSON cannot have an ID.  All the other QAPI schema syntax
errors can have an ID.

> (I know we have declared the error_class passe and we now prefer 
> "generic_error", but I lack the history lesson on why we do that.

The initial design had "rich" error objects, basically:

* An error QDict with a string "class" member and arbitrary data in a
  QDict "data" member

  To build a QMP error response, add the command's ID (if any) to the
  error QDict, and serialize to JSON.

* A pointer into a table of pairs (JSON template string, human-friendly
  template string).  More on the use of JSON template strings below.

  To build a human-friendly error message, interpolate the members of
  the data QDict into the human-friendly template.

Say you're writing a piece of code.  Among many other things, there are
a few errors to report.  You get to drop what you're doing, and do a
three step dance:

1. Define a QERR_YOUR_NEW_ERROR macro that expands into a JSON template
for the error QDict.  These are all in one place, which is included all
over the place.  Merge conflicts galore.

2. Add a table entry mapping QERR_YOUR_NEW_ERROR to the human-friendly
template.

3. Call the error function, passing QERR_YOUR_NEW_ERROR template and the
arguments.  The arguments must match the template.  This builds the
error QDict from template and the arguments, and looks up the template
in the table.

I strenuously objected to this.  In the end, all I could accomplish was
the inclusion of the human-friendly message in the QMP error response.

I (correctly) predicted two issues:

1. Error classes multiplied.  Not quite like rabbits, but bad enough to
be confusing.  Most of them were undocumented and of absolutely no use
to QMP clients.

2. The three step dance proved bothersome, and people frequently took
the obvious shortcut: instead of defining a new error that fits the
situation, reuse some existing error.  Even when it doesn't quite fit.
Error message quality went into the toilet.

After a few years of this, I went in with an axe, and nobody objected.

Among the stuff I axed were most error classes.  I spared only the ones
that were documented and plausibly useful to some QMP client.

> Wouldn't the error_class here be helpful for interpreting the response? 
> It helps differentiate between 'The RPC call was received' vs 'The RPC 
> call was received, but failed.' which may have semantic differences for 
> who wants to consume the error: the QMP library itself, or the user 
> making the RPC call. No?)

Can we have more error classes?  Yes, if they are documented and
definitely useful to some QMP client that matters.

>>> And I am assuming, in the context of a client that /always/ sets an
>>> 'id' for its execute statements, that this means that any error
>>> response we receive without an 'id' field *must* be associated with
>>> the most-recently-sent command.
>> 
>> Only if the client keeps no more than one command in flight.
>> 
>
> OK, so without engaging the OOB capability, we receive responses 
> strictly in a FIFO manner, always.
>
> So if we have:
>
> Send exec A, Send exec B, Send exec C
>
> and then we receive an error response with no ID attached, we know it 
> must be "A" that errored. Correct?

Correct.

QGA has a special case, but you probably don't want to know.

Even though counting responses is a workable way to map responses back
to requests, I recommend to either have at most one request in flight
(so no counting is necessary), or to add IDs to all requests (so
counting isn't necessary as long as you get the syntax right enough to
avoid the ID-less errors discussed above).

>> Command responses get sent strictly in order (even parse errors), except
>> for commands executed out-of-band.
>> 
>> If the client sends N JSON values, and only then reads responses, and
>> there are no parse errors, then it'll get N responses.  The i-th
>> response is for the i-th JSON value, except responses to OOB command may
>> "jump the queue".
>> 
>
> Let's assume now that A, B, and C are *all* OOB commands:
>
> - Send exec-oob A
> - Send exec-oob B
> - Send exec-oob C
>
> My client now reads an error message that has no ID attached. Which 
> command execution was this associated with?
>
> Is it necessarily "A" if we have not yet received any other response?

In this case, it can only be "A", because exec-oob is executed right
away, which means multiple exec-oob are executed in order, one after the
other.

More interesting example:

  - Send execute A
  - Send exec-oob B

First, consider the "no errors" case.  The first respone may be for A
or, if B managed to jump the queue, for B.

How?  The I/O thread receives A, sends it to the main loop, receives B,
executes it right away, and sends its result.

Meanwhile, the main loop receives A, executes it and sends the result.

Everything is received in the order it is sent.

The two threads race to send result.  That's a feature: it enables B to
jump the queue.

But what about ID-less errors?  Never fear, we actually thought this
out.

The I/O thread needs to parse QMP input some to be able to recognize
exec-oob.  Any parse errors are sent to the main loop, just like
requests.  The main loop receives and processes in order.  Errors it
simply sends, requests it executes, then sends the result.

This ensures that ID-less errors can never jump the queue.

>> With parse errors, it can get a different number of responses.
>> 
>
> Oh, uhm -- if the parse error was bad enough that it didn't even notice 
> we sent it three commands, right?

Yes.

What can a parser do after detecting an error?  Error recovery is
fundamentally guesswork.  The art has a substantial body of lore on
minimizing error cascades.  The basic technique is to silently skip
tokens until a some promising point, then resume parsing in a suitably
altered parser state.

Our error recovery is simplistic.  Happy to explain it in detail if
you're curious.  I figure you're more interested in practical advice
today.  docs/interop/qmp-spec.txt section has some:

    2.6 Forcing the JSON parser into known-good state
    -------------------------------------------------

    Incomplete or invalid input can leave the server's JSON parser in a
    state where it can't parse additional commands.  To get it back into
    known-good state, the client should provoke a lexical error.

    The cleanest way to do that is sending an ASCII control character
    other than '\t' (horizontal tab), '\r' (carriage return), or '\n' (new
    line).

    Sadly, older versions of QEMU can fail to flag this as an error.  If a
    client needs to deal with them, it should send a 0xFF byte.

> we sent it three commands, right? ... So if we see a parse error, we 
> might not know which, if any, of our queued commands were seen or can 
> still expect a response.
>
> That seems difficult to code around, unless I misunderstand.

"Doctor, it hurts when I send QMP input that isn't even JSON!"

> (Every parser error I receive -- which I can "guess" that it's a parser 
> error by the lack of an 'id' field, potentially invalidates the entire 
> queue?)

Yes, with stress on "potentially".  How much gets thrown away depends on
the error.  You know for sure only when you receive a reply with an ID.

>> Questions?
>> 
>
> A few :)

More?

[Prev in Thread]

Current Thread

[Next in Thread]

QMP and the 'id' parameter, John Snow, 2020/11/09
- Re: QMP and the 'id' parameter, Markus Armbruster, 2020/11/10
  - Re: QMP and the 'id' parameter, Daniel P . Berrangé, 2020/11/10
    - Re: QMP and the 'id' parameter, Markus Armbruster, 2020/11/10
  - Re: QMP and the 'id' parameter, John Snow, 2020/11/10
    - Re: QMP and the 'id' parameter, Markus Armbruster <=
    - Re: QMP and the 'id' parameter, John Snow, 2020/11/19
    - Re: QMP and the 'id' parameter, Markus Armbruster, 2020/11/20
    - Re: QMP and the 'id' parameter, John Snow, 2020/11/20
    - Re: QMP and the 'id' parameter, Markus Armbruster, 2020/11/23
    - Re: QMP and the 'id' parameter, John Snow, 2020/11/30

Prev by Date: [PATCH] qemu-keymap.c: Fix bad printf format specifiers
Next by Date: Re: [PATCH-for-6.0 v4 14/17] gitlab-ci: Move trace backend tests across to gitlab
Previous by thread: Re: QMP and the 'id' parameter
Next by thread: Re: QMP and the 'id' parameter
Index(es):
- Date
- Thread