Re: qapi-schema esotera

From: Markus Armbruster
Subject: Re: qapi-schema esotera
Date: Wed, 05 Aug 2020 10:10:05 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux)

John Snow <jsnow@redhat.com> writes:

> On 8/4/20 1:33 AM, Markus Armbruster wrote:
>> John Snow <jsnow@redhat.com> writes:
>>> On 8/3/20 1:25 PM, Eric Blake wrote:
>>>> On 8/3/20 11:49 AM, John Snow wrote:
>>>>> UNION is split into two primary forms:
>>>>> 1. Simple (No discriminator nor base)
>>>>> 2. Flat (Discriminator and base)
>>>>> In expr.py, I notice that we modify the perceived type of the
>>>>> 'type' expression based on the two union forms.
>>>>> 1a. Simple unions allow Array[T]
>>>>> 1b. Flat unions disallow Array[T]
>>>> Rather, branches in a simple union are syntactic sugar for a
>>>> wrapper struct that contains a single member 'data'; because of that
>>>> extra nesting, the type of that single member is unconstrained.  In
>>>> flat unions, the type MUST be a QAPI struct, because its members
>>>> will be used inline; as currently coded, this prevents the use of an
>>>> intrinsic type ('int', 'str') or an array type.
>>> I meant syntactically here, to be clear. I'm looking at expr.py -- if
>>> there are deeper constraints on the semantics of the information
>>> provided, that happens later.
>>> Specifically, check_union's use of check_type() changes depending on
>>> the form of the union. One allows a string, the other allows a List of
>>> strings, provided the list is precisely one element long.
>>>> If you need to use an array type in a flat union, you can't do:
>>>> { 'union' ...
>>>>     'data': { 'foo': [ 'MyBranch' ] } }
>>>> but you can provide a wrapper type yourself:
>>>> { 'struct': 'MyBranch', 'data': { 'array': [ 'MyType' ] } }
>>>> { 'union' ...
>>>>     'data': { 'foo': 'MyBranch' } }
>>>>>   From the docs:
>>>>> Syntax:
>>>>>       UNION = { 'union': STRING,
>>>>>                 'data': BRANCHES,
>>>>>                 '*if': COND,
>>>>>                 '*features': FEATURES }
>>>>>             | { 'union': STRING,
>>>>>                 'data': BRANCHES,
>>>>>                 'base': ( MEMBERS | STRING ),
>>>>>                 'discriminator': STRING,
>>>>>                 '*if': COND,
>>>>>                 '*features': FEATURES }
>>>>>       BRANCHES = { BRANCH, ... }
>>>>>       BRANCH = STRING : TYPE-REF
>>>>>              | STRING : { 'type': TYPE-REF, '*if': COND }
>>>>> Both arms use the same "BRANCHES" grammar production, which both
>>>>> use TYPE-REF.
>>>>>       ARRAY-TYPE = [ STRING ]
>>>>> Implying that List[T] should be allowed for both productions.
>>>>> Can I ask for a ruling from the judges?
>>>> As you found, the docs are a bit misleading; the semantic constraint
>>>> on flat union branches being a struct (because they will be inlined)
>>>> prevents the use of type-refs that are valid in simple unions (where
>>>> those simple types will be wrapped in an implicit struct).  A patch
>>>> to improve the docs would be a reasonable idea.
>>> Yes. I was working on a YAML prototype and I am trying to follow the
>>> existing parser as closely as possible. In some cases, this highlights
>>> differences between the grammar as advertised and what the parser
>>> actually does.
>> Please report all such differences, so we can fix them.
> You have been the delightful beneficiary of all doubts thus far, I
> promise. I am not aware of more discrepancies at the moment, but I
> didn't finish my prototype, either.
>>> If we are to keep the current state of things, splitting UNION into
>>> two separate productions might be nice.
>> It *is* two productions, joined with |.
> I ... yes. Technically correct. I had meant separating them out even
> further in the docs, which I suppose implies two top-level construct
> names with how you have the grammar laid out.
> I see you want to get rid of one of these productions, though, so
> don't worry about this thought of mine. We can simplify in the other
> direction.
>> The work unions really, really need is:
>> * Eliminate the simple union sugar.
> What do you mean by "simple union sugar"? Wait, before you answer, let
> me make sure I have the nuances of the forms straight in my head.
> The following is my attempt to summarize what I know about these forms.
> (Please correct me where I am mistaken.)
> ALTERNATE is like an untagged union with no discriminator/tag on the
> wire. I think of a pure C union when I think of this form. The forms
> you can use are limited, based on our ability to differentiate them
> upon parsing.

An alternate type is like a union type, except there is no
discriminator on the wire.  Instead, the branch to use is inferred
from the value.  An alternate can only express a choice between types
represented differently on the wire.
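
To illustrate "inferred from the value", here is a minimal Python
sketch.  It is not the QAPI generator's code: wire_kind() and
pick_branch() are hypothetical names, and the table mapping a wire
representation to a branch is an assumption made for the example.

```python
def wire_kind(value):
    """Classify a decoded JSON value by how it is represented on the wire."""
    # bool is tested first: in Python, bool is a subclass of int.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    if isinstance(value, dict):
        return "object"
    if value is None:
        return "null"
    raise TypeError("not a JSON value: %r" % (value,))


def pick_branch(branches, value):
    """Return the branch name for value.

    branches maps a wire kind to a branch name (hypothetical layout).
    Because it is keyed by wire kind, two branches with the same
    representation cannot coexist, which mirrors the restriction above.
    """
    kind = wire_kind(value)
    if kind not in branches:
        raise ValueError("no branch accepts a JSON %s" % kind)
    return branches[kind]


# An alternate choosing between an object and a string.
ref = {"object": "definition", "string": "reference"}
print(pick_branch(ref, "node0"))             # string branch
print(pick_branch(ref, {"driver": "file"}))  # object branch
```

Keying the table by wire kind is what makes the choice unambiguous
without a discriminator.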

> SIMPLE UNION takes no `discriminator` or `base` parameter in the QAPI
> specification. However, the wire format is not an undifferentiated
> union.
> { 'union': 'foobar',
>   'data': { 'a': 'TypeA',
>             'b': 'TypeB' } }
> Enjoys life at runtime as:
> { "type": ['a' | 'b'],
>   "data": ... }
> (with TypeA or TypeB's definition filling in for the ellipsis as
> denoted by the type field.)


> FLAT UNION has a more complex definitional form. It specifies a base
> type reference by name *or* defined in-line. It also specifies a
> discriminator, which must be an enumerated type in the base.
> For data, it no longer allows you to specify List[T] as a member type.
> For inline definitions of base, it uses a version of type info that
> also allows the FEATURES field.
> (Deep breath).
> So, when you say remove "simple union sugar", do you mean the entirety
> of the tagged union form? What do we replace it by?

A simple union can always be re-written as a flat union where the base
class has a single member named 'type', and where each branch of the
union has a struct with a single member named 'data'.  That is,

 { 'union': 'Simple', 'data': { 'one': 'str', 'two': 'int' } }

is identical on the wire to:

 { 'enum': 'Enum', 'data': ['one', 'two'] }
 { 'struct': 'Branch1', 'data': { 'data': 'str' } }
 { 'struct': 'Branch2', 'data': { 'data': 'int' } }
 { 'union': 'Flat', 'base': { 'type': 'Enum' }, 'discriminator': 'type',
   'data': { 'one': 'Branch1', 'two': 'Branch2' } }

This is from docs/devel/qapi-code-gen.txt.  One to put under your pillow.
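
The rewrite is mechanical enough to sketch in Python.  This is only an
illustration of the equivalence quoted above, not what the generator
does: desugar_simple_union() is a hypothetical helper, the enum and
wrapper names it invents are assumptions, and it handles only the
short STRING : TYPE-REF branch form.

```python
def desugar_simple_union(expr):
    """Turn a simple union expression into equivalent flat-union definitions.

    Returns a list of schema expressions: one enum for the tag, one
    wrapper struct per branch, and the flat union itself.
    """
    name = expr["union"]
    branches = expr["data"]

    enum_name = name + "Kind"  # made-up naming scheme
    result = [{"enum": enum_name, "data": list(branches)}]

    flat_branches = {}
    for tag, type_ref in branches.items():
        wrapper = "%s_%s_Wrapper" % (name, tag)  # made-up naming scheme
        # The wrapper has a single member 'data'.  Its type is
        # unconstrained, so 'str', 'int' or an array type all pass through.
        result.append({"struct": wrapper, "data": {"data": type_ref}})
        flat_branches[tag] = wrapper

    result.append({
        "union": name,
        "base": {"type": enum_name},
        "discriminator": "type",
        "data": flat_branches,
    })
    return result


for definition in desugar_simple_union(
        {"union": "Simple", "data": {"one": "str", "two": "int"}}):
    print(definition)
```

The three kinds of definition it emits are exactly the notational
overhead in question.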

> (Hardcoded, but compatible flat unions that use "type" field as
> discriminator to ensure backwards compatibility?)


The one reason why I haven't done so already is the notational
overhead.  Therefore:

>> * Make flat unions less cumbersome to write.  I'd like to fuse struct
>>    and union into a single object type, like introspect.json already
>>    does.
> Can you share what you have in mind for how to fuse 'struct' and
> discriminated unions? At the high QAPI grammatical level; no need to
> delve into code generator details.
> (Unless you want to, and then I'll read them.)

An object type similar to a Pascal variant record / Ada discriminated
type: any number of common members, plus any number of variants.  If
there are variants, then there is an additional common member, the tag.

introspect.json already works that way: have a look at SchemaInfoObject.
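
As a rough model of such an object type, here is a Python sketch.  The
ObjectType class and members_on_wire() are hypothetical and only
loosely follow the members / tag / variants shape; the InputEvent
data is made up for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ObjectType:
    # Common members: name -> type name.  Includes the tag, if any.
    members: Dict[str, str]
    # Name of the common member that selects the variant, or None.
    tag: Optional[str] = None
    # Variants: tag value -> that variant's members (name -> type name).
    variants: Dict[str, Dict[str, str]] = field(default_factory=dict)


def members_on_wire(obj, value):
    """Return the members expected for value: common ones plus the
    members of the variant selected by the tag."""
    expected = dict(obj.members)
    if obj.tag is not None:
        case = value[obj.tag]
        expected.update(obj.variants[case])
    return expected


# A plain struct is simply an object type without variants.
point = ObjectType(members={"x": "int", "y": "int"})

# With variants, the tag is one more common member.
event = ObjectType(
    members={"type": "InputEventKind"},
    tag="type",
    variants={"key": {"key": "KeyValue", "down": "bool"}},
)

print(members_on_wire(point, {"x": 1, "y": 2}))
print(members_on_wire(event, {"type": "key"}))
```

With one type covering both cases, the common members always live in
the same place.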

The part that takes actual thought is the QAPI schema language design:
how can we write such types with much less overhead than flat unions?

Listing the common members in 'base' when there are variants, but in
'data' when there are none, is a complication we can do without.

Sometimes, we want to reuse an existing enumeration type for the tag.
Sometimes, we'd rather derive one from the variants.

Sometimes, we want to reuse an existing struct type for a variant.
Sometimes, we'd rather define the variant inline.

The spartan lower layer syntax will force some compromises.  For
instance, we can't do inline variants like

    { 'union' : 'InputEvent',
      'data'  : {
        'key': { 'key': 'KeyValue', 'down': 'bool' }
        ... } }

because we need the { } form for specifying properties other than the
type, e.g.

    { 'union': 'BlockdevOptions',
      'data': {
        'replication': { 'type': 'BlockdevOptionsReplication',
                         'if': 'defined(CONFIG_REPLICATION)' },
        ... } }
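
The clash can be shown with a small Python sketch.  normalize_branch()
is a hypothetical helper written for this mail, not the code in
expr.py; it assumes a branch value is either a type reference or an
object with the keys 'type' and 'if'.

```python
def normalize_branch(value):
    """Bring a branch value into the long form { 'type': ..., 'if': ... }."""
    if isinstance(value, dict):
        # An object here already means "type plus properties", so every
        # key must be a known property, and 'type' is required.
        unknown = set(value) - {"type", "if"}
        if "type" not in value or unknown:
            raise ValueError("not a branch property object: %r" % (value,))
        return dict(value)
    # Short form: a bare type reference.
    return {"type": value}


print(normalize_branch("BlockdevOptionsReplication"))
print(normalize_branch({"type": "BlockdevOptionsReplication",
                        "if": "defined(CONFIG_REPLICATION)"}))

# An inline variant is an object too, so it lands in the same case and
# is rejected here.  A variant with members named 'type' and 'if' would
# even be mistaken for the long form.
try:
    normalize_branch({"key": "KeyValue", "down": "bool"})
except ValueError as err:
    print("rejected:", err)
```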


>> The former is a matter of massaging the schema and simplifying code.
>> The latter requires actual thought.  No big deal, just takes time, and
>> time is always in short supply.
> --js
