qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v5 01/28] qapi: Document type-safety considerati


From: Markus Armbruster
Subject: Re: [Qemu-devel] [PATCH v5 01/28] qapi: Document type-safety considerations
Date: Thu, 26 Mar 2015 09:09:41 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3 (gnu/linux)

Eric Blake <address@hidden> writes:

> On 03/25/2015 12:31 PM, Markus Armbruster wrote:
>> Eric Blake <address@hidden> writes:
>> 
>>> Go into more details about the various types of valid expressions
>>> in a qapi schema, including tweaks to document fixes being done
>>> later in the current patch series.  Also fix some stale and missing
>>> documentation in the QMP specification.
>>>
>>> Signed-off-by: Eric Blake <address@hidden>
>>>
>>> ---
>>>
>>> I'm not sure if it is okay to assert GPLv2+ licensing without an
>>> explicit Copyright, but as I am not the original author, I don't
>>> know who to attribute any original Copyright to.  Advice?  Should
>>> I split the license addition to a separate patch?
>
> No thoughts to this question?

Separate message.

>>>
>>> +There are six top-level expressions recognized by the parser:
>>> +'include', 'command', 'type', 'enum', 'union', and 'event'.  There are
>>> +several built-in types, such as 'int' and 'str'; additionally, the
>>> +top-level expressions can define complex types, enumeration types, and
>>> +several flavors of union types.  The 'command' expression can refer to
>>> +existing types by name, or list an anonymous type as a dictionary.
>> 
>> 'event' can do that, too.
>
> Yep. I can fix that on top if no respin is needed.
>
>
>>> +Types, commands, and events share a common namespace.  Therefore,
>>> +generally speaking, type definitions should always use CamelCase for
>>> +user-defined type names, while built-in types are lowercase. Type
>>> +definitions should not end in 'Kind', as this namespace is used for
>>> +creating implicit C enums for visiting union types.  Command names,
>>> +and field names within a type, should be all lower case with words
>>> +separated by a hyphen.  However, some existing older commands and
>>> +complex types use underscore; when extending such expressions,
>>> +consistency is preferred over blindly avoiding underscore.  Event
>>> +names should be ALL_CAPS with words separated by underscore.  The
>>> +special string '**' appears for some commands that manually perform
>>> +their own type checking rather than relying on the type-safe code
>>> +produced by the qapi code generators.
>> 
>> The generator generates C identifiers pretty thoughtlessly, and is
>> therefore prone to generate clashes.  Fixable, but we've got bigger fish
>> to fry, and this series is hefty enough as it is.  Documenting the mess
>> instead makes sense.
>
> And I actually DO tighten up the parser to flag some of these problems
> later in the series.
>
>
>>>  The QAPI schema definitions can be modularized using the 'include'
>>> directive:
>>>
>>>   { 'include': 'path/to/file.json'}
>> 
>> If you need to respin for some reason, insert a space before the closing
>> brace.
>
> Sure; added to my on-top-or-respin pile.
>
>
>>>
>>> -A complex type is a dictionary containing a single key whose value is a
>>> -dictionary.  This corresponds to a struct in C or an Object in JSON.  An
>>> -example of a complex type is:
>>> +Usage: { 'type': 'str', 'data': 'dict', '*base': 'complex-type-name' }
>> 
>> Here, the reader needs to get that 'type' and 'data' and 'base' are
>> literals, but 'str', 'dict' and 'complex-type-name' are placeholders.  I
>> like setting apart placeholders with typograhic conventions.  If we want
>> to do that, let's do it on top.
>
> I was trying to borrow from qapi syntax, in that the 'data' dictionary
> uses '*name':'type' for a literal "name" coupled with a
> value-of-that-type pair in the dictionary sent over the wire for
> "arguments".  As for a typographical convention, would either of these
> be any easier to read?
>
> Usage: { 'type': 'STR', 'data': 'DICT', '*base': 'COMPLEX-TYPE-NAME' }
>
> or
>
> Usage: { 'type': string, 'data': dict [, 'base': complex-type-name] }

Or

  Usage: { 'type': STRING, 'data': DICT [, 'base': COMPLEX-TYPE-NAME] }

Makes the placeholders stand out.  Unambiguous as long as we don't use
ALL-CAPS for anything else outside quotes.

Using QAPI's *-prefix instead of EBNF's brackets

  Usage: { 'type': STRING, 'data': DICT , '*base': COMPLEX-TYPE-NAME }

would also work, but I feel the *-prefix needs to be explained earlier
then.

>> 
>> '*base' is interesting.  It's a tacit use of the schema language's
>> "'*name' means member 'name' is optional" rule on meta-schema-level.  We
>> can take care of that when/if we clarify placeholders.
>> 
>>> +
>>> +A complex type is a dictionary containing a single 'data' key whose
>>> +value is a dictionary.  This corresponds to a struct in C or an Object
>>> +in JSON. Each value of the 'data' dictionary must be the name of a
>>> +complex, enum, union, or built-in type,
>> 
>> Are there any others?  If not, we can simplify to
>> 
>>     Each value of the 'data' dictionary must be a type name
>
> Works for me, especially since the addition of 'alternate' later in this
> series is also a type that works in this context.
>
>> 
>>>                                          or a one-element array
>>> +containing a type name.  An example of a complex type is:
>>>
>>>   { 'type': 'MyType',
>>>     'data': { 'member1': 'str', 'member2': 'int', '*member3': 'str' } }
>>>
>>> -The use of '*' as a prefix to the name means the member is optional.
>>> +The use of '*' as a prefix to the name means the member is optional in
>>> +the corresponding QMP usage.
>> 
>> Not just in QMP, also QGA and (internal) C interfaces.  Since explaining
>> that is tiresome, I'd be tempted to stick to the original sentence here.
>> 
>> Preexisting: we sometimes say "QMP wire format", and sometimes simply
>> "JSON".  Again, we got bigger fish to fry.
>
> If I do anything for this, it will be a separate patch on top scrubbing
> for all uses.

Absolutely.

>>> +Additionally, an implicit C enum NameKind is created, corresponding to
>>> +the union Name, for accessing the various branches of the union.  No
>>> +branch of the union can be named 'max', as this would collide with the
>>> +implicit enum.
>> 
>> Aside: I guess this implicit enum is just like a normal QAPI enum,
>> except it doesn't have a name in the schema.
>
> Implemented identically; and why I want to eventually add:
>
> { 'enum': 'Enum', 'data': [ 'one', 'two' ] }
> { 'union': 'Name', 'discriminator': 'Enum',
>   'data': { 'one': 'str', 'two': 'int' } }
> { 'command': 'foo', 'data': 'Name' }
>
> to allow this on the wire:
>
> { "execute": "foo", "arguments": { "type": "one", "data": "string" } }
> { "execute": "foo", "arguments": { "type": "two", "data": 1 } }
>
>>>   { 'type': 'BlockdevCommonOptions', 'data': { 'readonly': 'bool' } }
>>>   { 'union': 'BlockdevOptions',
>> 
>> Worth mentioning explicitly that the base type's members are common to
>> all the union's cases?
>
> Sure.
>
>
>>> +The final flavor of unions is an anonymous union. While the other two
>>> +union types are always passed as a dictionary in the wire format, an
>>> +anonymous union instead allows the direct use of different types in
>>> +its place. As they aren't structured, they don't have any explicit
>>> +discriminator but use the (QObject) data type of their value as an
>>> +implicit discriminator. This means that they are restricted to using
>>> +only one discriminator value per QObject type. For example, you cannot
>>> +have two different complex types in an anonymous union, or two
>>> +different integer types.
>> 
>> This is the first mention of "QObject type".  How it's related to JSON
>> is left to the reader's deductive skills :)
>> 
>> The QObject types are QTYPE_NONE, QTYPE_QINT, QTYPE_QSTRING,
>> QTYPE_QDICT, QTYPE_QLIST, QTYPE_QFLOAT, QTYPE_QBOOL, QTYPE_QERROR.
>> 
>> The connections JSON string - QTYPE_QSTRING, JSON object - QTYPE_QDICT,
>> JSON array - QTYPE_QLIST and JSON boolean - QTYPE_QBOOL are obvious
>> enough.
>> 
>> If I remember correctly, our JSON parser chokes on the JSON keyword
>> null.
>> 
>> That leaves just JSON numbers - QTYPE_QINT or QTYPE_QFLOAT.  Can an
>> anonymous union have a separate case for each of the two?
>> 
>> For completeness: on the QTYPE_ side, it leaves QTYPE_QERROR and
>> QTYPE_NONE.  QTYPE_QERROR is internal only, and will hopefully be gone
>> soon.  I can't see QTYPE_NONE objects being created anywhere.
>> 
>> Should we explain this in terms of JSON types instead of QObject types?
>
> Yeah, makes sense.  Sounds like my on-top patch is getting bigger.
>
>> 
>>>
>>>  Anonymous unions are declared using an empty dictionary as their
>>> discriminator.
>>>  The discriminator values never appear on the wire, they are only used in 
>>> the
>>> @@ -200,23 +350,93 @@ This example allows using both of the
>>> following example objects:
>>>
>>>  === Commands ===
>>>
>>> -Commands are defined by using a list containing three members.  The first
>>> -member is the command name, the second member is a dictionary containing
>>> -arguments, and the third member is the return type.
>>> +Usage: { 'command': 'str', '*data': 'dict-or-complex-type-name',
>>> +         '*returns': 'type',
>> 
>> Shoudn't this be '*returns': 'dict-or-type-name'?
>
> Yep.  Good catch.
>
>> 
>>> +         '*gen': false, '*success-response': false }
>>>
>>> -An example command is:
>>> +Commands are defined by using a dictionary containing several members,
>>> +where three members are most common.  The 'command' member is a
>>> +mandatory string, and determines the "execute" value passed in a QMP
>>> +command exchange.
>>> +
>>> +The 'data' member is optional; if absent, the command accepts an
>>> +optional empty dictionary.
>> 
>> Suggest: The 'data' member is optional, and defaults to {}.
>> 
>>> +                            If present, it must be the string name of
>>> +a complex type, a one-element array containing the name of a complex
>>> +type, or a dictionary that declares an anonymous type with the same
>>> +semantics as a 'type' expression, with one exception noted below when
>>> +'gen' is used.  The 'data' argument maps to the "arguments" dictionary
>>> +passed in as part of a QMP command.
>> 
>> Suggest to move the last sentence to the beginning of the paragaph.
>
> Good ideas.
>
>> 
>>> +
>>> +The 'returns' member describes what will appear in the "return" field
>>> +of a QMP reply on successful completion of a command.  The member is
>>> +optional from the command declaration; if absent, the "return" field
>>> +will be an empty dictionary.  If 'returns' is present, it must be the
>>> +string name of a complex or built-in type, a one-element array
>>> +containing the name of a complex or built-in type, or a dictionary
>>> +that declares an anonymous type with the same semantics as a 'type'
>>> +expression, with one exception noted below when 'gen' is used.
>> 
>> Oh, 'gen' affect 'returns', too?  Live and learn...
>
> Yes; thanks to 'qom-get'.
>
>
>>> +
>>> + { 'command': 'netdev_add',
>>> +   'data': {'type': 'str', 'id': 'str', '*props': '**'},
>>> +   'gen': false }
>> 
>> We use 'gen': 'no' until PATCH 18.
>
> Yeah.  I debated about that one, whether to do all my docs to their
> final state up front, or whether to doc existing practice and then tweak
> it to be more specific later in the series as fixes were made.  I guess
> you can see which way I chose.  And of course, if I do more followups on
> top of the series, I no longer have all changes in one place.  Oh well;
> doc changes don't affect bisection :)
>
> (well, I did more docs later when creating 'alternate' in place of
> anonymous union, but still limited to two patches)

Not enough OCD to make you evolve the documentation in lockstep with the
code 100% ;-)

>>> +
>>> +Normally, the QAPI schema is used to describe synchronous exchanges,
>>> +where a response is expected.  But in some cases, the action of a
>>> +command is expected to change state in a way that a successful
>>> +response is not possible (the command still returns a normal
>>> +dictionary error on failure).  When a successful reply is not
>>> +possible, the command expression should include the optional key
>>> +'success-response' with boolean value false.  So far, only the
>>> +qemu-guest-agent makes use of this field.
>> 
>> I had no idea :)
>
> You said that last time :)
> https://lists.gnu.org/archive/html/qemu-devel/2014-09/msg04254.html

I guess we're approaching the state where we'll have forgotten more
about the QAPI schema language than the average QEMU developer needs to
know...

>>>
>>>  http://www.ietf.org/rfc/rfc4627.txt
>> 
>> Upgrade to RFC 7159?
>
> Sure; wasn't even aware it was out there.

Me neither, until I stumbled over it the other day.

> /me reads it
>
> Looks like it added hints about objects being true dictionaries
> (repeating a name leads to unpredictable results, you can't rely on
> order being preserved in the dictionary by default), clarified that
> arrays need not be homogeneous types; gave more details on the limits of
> numbers, spoke more about consequences of encoding in other than UTF-8,
> and required that escape sequences be normalized before comparing
> strings for equality.  None of those should really hurt our usage (well,
> we DO document that our use of json-object for 'data' member is
> special-cased in that order is significant to the generated C code but
> not to the QMP wire format).
>
>
>>>
>>>  The format of a success response is:
>>>
>>> -{ "return": json-object, "id": json-value }
>>> +{ "return": json-entity, "id": json-value }
>> 
>> Unlike the other json-FOOs we use, "entity" isn't defined in RFC4627.
>> "value" is, and we already use json-value.  What's the difference
>> between the two?
>
> Umm, when I wrote that, I was thinking "id":json-value meant integer, so
> I wanted something that was not an integer.  But sure enough, json-value
> is precisely the term I wanted to use:
>
> $ qemu-kvm -qmp stdio -nodefaults
> {"QMP": {"version": {"qemu": {"micro": 90, "minor": 2, "major": 2},
> "package": " (qemu-2.3.0-0.1.rc0.fc21)"}, "capabilities": []}}
> {"execute":"qmp_capabilities","id":[]}
> {"return": {}, "id": []}
> {"execute":"huh", "id":{"a":1,"b":true}}
> {"id": {"a": 1, "b": true}, "error": {"class": "CommandNotFound",
> "desc": "The command huh has not been found"}}
>
> So _I_ learned something (the id can be anything at all!)
>
> Looks like I have some wording improvements to add.
>
>> 
>>>
>>>   Where,
>>>
>>> -- The "return" member contains the command returned data, which is defined
>>> -  in a per-command basis or an empty json-object if the command does not
>>> -  return data
>>> +- The "return" member contains the data returned by the command, which
>>> +  is defined on a per-command basis (usually a json-object or
>>> +  json-array of json-objects, but sometimes a json-value, json-string,
>>> +  or json-array of json-strings); it is an empty json-object if the
>>> +  command does not return data
>> 
>> Err, aren't json-object, -string, -array all json-value?  At least
>> that's how the JSON RFC uses "value".
>
> Yep.
>
>> 
>> Massive improvement.  We can always improve some more on top, thus:
>> 
>> Reviewed-by: Markus Armbruster <address@hidden>
>
> Thank you.  If there's cause for respin, I can fold in some improvements
> then; otherwise, I'm already working on putting my changes into a
> followup patch.

Followup patch works for me.



reply via email to

[Prev in Thread] Current Thread [Next in Thread]