qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v3 10/10] qapi: scripts: add a generator for qapi's examples


From: Victor Toso
Subject: Re: [PATCH v3 10/10] qapi: scripts: add a generator for qapi's examples
Date: Wed, 18 Oct 2023 21:35:41 +0200

Hi Markus,

Sorry the delay on reply here.

On Thu, Sep 21, 2023 at 01:06:01PM +0200, Markus Armbruster wrote:
> Victor Toso <victortoso@redhat.com> writes:
> 
> > This generator has two goals:
> >  1. Mechanical validation of QAPI examples
> >  2. Generate the examples in a JSON format to be consumed for extra
> >     validation.
> >
> > The generator iterates over every Example section, parsing both server
> > and client messages. The generator prints any inconsistency found, for
> > example:
> >
> >  |  Error: Extra data: line 1 column 39 (char 38)
> >  |  Location: cancel-vcpu-dirty-limit at qapi/migration.json:2017
> >  |  Data: {"execute": "cancel-vcpu-dirty-limit"},
> >  |      "arguments": { "cpu-index": 1 } }
> 
> What language does it parse?

It parsers the Example section. Fetches client and server JSON
messages. Validate them.

> Can you give a grammar?

Not sure if needed? I just fetch the json from the example
section and validate it.
 
> Should the parser be integrated into the doc parser, i.e. class QAPIDoc?

Yes, that would be better. I'll try that in the next iteration.

> > The generator will output other JSON file with all the examples in the
> 
> Another JSON file, or other JSON files?

JSON files. QEMU'S QAPI qapi/migration.json will generate a
migration.json with the format mentioned bellow.
> 
> > QAPI module that they came from. This can be used to validate the
> > introspection between QAPI/QMP to language bindings, for example:
> >
> >  | { "examples": [
> >  |   {
> >  |     "id": "ksuxwzfayw",
> >  |     "client": [
> >  |     {
> >  |       "sequence-order": 1
> 
> Missing comma
> 
> >  |       "message-type": "command",
> >  |       "message":
> >  |       { "arguments":
> >  |         { "device": "scratch", "size": 1073741824 },
> >  |         "execute": "block_resize"
> >  |       },
> >  |    } ],
> >  |    "server": [
> >  |    {
> >  |      "sequence-order": 2
> 
> Another one.
> 
> >  |      "message-type": "return",
> >  |      "message": { "return": {} },
> 
> Extra comma.
> 
> >  |    } ]
> >  |    }
> >  |  ] }
> 
> Indentation is kind of weird.

The missing comma was likely my fault on copy & paste.  The
indentation should be what json.dumps() provides, but I think I
changed somehow in the git log too.
 
> The JSON's Valid structure and semantics are not documented.
> We've developed a way specify that, you might've heard of it,
> it's called "QAPI schema" ;-P
> 
> Kidding aside, we've done this before.  E.g. docs/interop/firmware.json
> specifies firmware descriptors.  We have some in pc-bios/descriptors/.

Oh, that's neat. You meant to use QAPI schema as a way to
document output from examples.py?
 
> > Note that the order matters, as read by the Example section and
> > translated into "sequence-order". A language binding project can then
> > consume this files to Marshal and Unmarshal, comparing if the results
> > are what is to be expected.
> >
> > RFC discussion:
> >     https://lists.gnu.org/archive/html/qemu-devel/2022-08/msg04641.html
> >
> > Signed-off-by: Victor Toso <victortoso@redhat.com>
> > ---
> >  scripts/qapi/dumpexamples.py | 208 +++++++++++++++++++++++++++++++++++
> >  scripts/qapi/main.py         |   3 +-
> >  2 files changed, 210 insertions(+), 1 deletion(-)
> >  create mode 100644 scripts/qapi/dumpexamples.py
> >
> > diff --git a/scripts/qapi/dumpexamples.py b/scripts/qapi/dumpexamples.py
> > new file mode 100644
> > index 0000000000..55d9f13ab7
> > --- /dev/null
> > +++ b/scripts/qapi/dumpexamples.py
> 
> Let's name this examples.py.  It already does a bit more than
> dump (namely validate), and any future code dealing with
> examples will likely go into this file.

Ack.

> > @@ -0,0 +1,208 @@
> > +"""
> > +Dump examples for Developers
> > +"""
> > +# Copyright (c) 2023 Red Hat Inc.
> > +#
> > +# Authors:
> > +#  Victor Toso <victortoso@redhat.com>
> > +#
> > +# This work is licensed under the terms of the GNU GPL, version 2.
> 
> We should've insisted on v2+ for the QAPI generator back when we had a
> chance.  *Sigh*

:)
 
> > +# See the COPYING file in the top-level directory.
> > +
> > +# Just for type hint on self
> > +from __future__ import annotations
> > +
> > +import os
> > +import json
> > +import random
> > +import string
> > +
> > +from typing import Dict, List, Optional
> > +
> > +from .schema import (
> > +    QAPISchema,
> > +    QAPISchemaType,
> > +    QAPISchemaVisitor,
> > +    QAPISchemaEnumMember,
> > +    QAPISchemaFeature,
> > +    QAPISchemaIfCond,
> > +    QAPISchemaObjectType,
> > +    QAPISchemaObjectTypeMember,
> > +    QAPISchemaVariants,
> 
> pylint warns
> 
>     scripts/qapi/dumpexamples.py:22:0: W0611: Unused QAPISchemaEnumMember 
> imported from schema (unused-import)
>     scripts/qapi/dumpexamples.py:22:0: W0611: Unused 
> QAPISchemaObjectTypeMember imported from schema (unused-import)
>     scripts/qapi/dumpexamples.py:22:0: W0611: Unused QAPISchemaVariants 
> imported from schema (unused-import)

Yes, now I learned a few tricks around python linters and
formatting. Next iteration will be better.
 
> > +)
> > +from .source import QAPISourceInfo
> > +
> > +
> > +def gen_examples(schema: QAPISchema,
> > +                 output_dir: str,
> > +                 prefix: str) -> None:
> > +    vis = QAPISchemaGenExamplesVisitor(prefix)
> > +    schema.visit(vis)
> > +    vis.write(output_dir)
> 
> The other backends have this at the end of the file.  Either order
> works, but consistency is nice.

Ok.
 
> > +
> > +
> > +def get_id(random, size: int) -> str:
> 
> pylint warns
> 
>     scripts/qapi/dumpexamples.py:44:11: W0621: Redefining name
>     'random' from outer scope (line 17) (redefined-outer-name)
> 
> > +    letters = string.ascii_lowercase
> > +    return ''.join(random.choice(letters) for i in range(size))
> > +
> > +
> > +def next_object(text, start, end, context) -> (Dict, bool):
> > +    # Start of json object
> > +    start = text.find("{", start)
> > +    end = text.rfind("}", start, end+1)
> 
> Single quotes, please, for consistency with quote use elsewhere.
> 
> Rules of thumb:
> 
> * Double quotes for english text (e.g. error messages), so we don't have
>   to escape apostrophes.
> 
> * Double quotes when they reduce the escaping
> 
> * Else single quotes
> 
> More elsewhere, not flagged.
> 
> > +
> > +    # try catch, pretty print issues
> 
> Is this comment useful?

I'll remove.
 
> > +    try:
> > +        ret = json.loads(text[start:end+1])
> > +    except Exception as e:
> 
> pylint warns
> 
>     scripts/qapi/dumpexamples.py:57:11: W0703: Catching too general exception 
> Exception (broad-except)
> 
> Catch JSONDecodeError?

Ack.

> > +        print("Error: {}\nLocation: {}\nData: {}\n".format(
> > +              str(e), context, text[start:end+1]))
> 
> Errors need to go to stderr.
> 
> Have you considered using QAPIError to report these errors?

Did not cross my mind, no. I'll take a look.
 
> > +        return {}, True
> > +    else:
> > +        return ret, False
> > +
> > +
> > +def parse_text_to_dicts(text: str, context: str) -> (List[Dict], bool):
> 
> Before I review the parser, I'd like to know the (intended) language
> being parsed.

Ok, I'll add documentation about it in the next iteration.
 
> > +    examples, clients, servers = [], [], []
> > +    failed = False
> > +
> > +    count = 1
> > +    c, s = text.find("->"), text.find("<-")
> > +    while c != -1 or s != -1:
> > +        if c == -1 or (s != -1 and s < c):
> > +            start, target = s, servers
> > +        else:
> > +            start, target = c, clients
> > +
> > +        # Find the client and server, if any
> > +        if c != -1:
> > +            c = text.find("->", start + 1)
> > +        if s != -1:
> > +            s = text.find("<-", start + 1)
> > +
> > +        # Find the limit of current's object.
> > +        # We first look for the next message, either client or server. If 
> > none
> > +        # is avaible, we set the end of the text as limit.
> > +        if c == -1 and s != -1:
> > +            end = s
> > +        elif c != -1 and s == -1:
> > +            end = c
> > +        elif c != -1 and s != -1:
> > +            end = (c < s) and c or s
> 
> pylint advises
> 
>     scripts/qapi/dumpexamples.py:91:12: R1706: Consider using ternary (c if c 
> < s else s) (consider-using-ternary)
> 
> > +        else:
> > +            end = len(text) - 1
> > +
> > +        message, error = next_object(text, start, end, context)
> > +        if error:
> > +            failed = True
> > +
> > +        if len(message) > 0:
> > +            message_type = "return"
> > +            if "execute" in message:
> > +                message_type = "command"
> > +            elif "event" in message:
> > +                message_type = "event"
> > +
> > +            target.append({
> > +                "sequence-order": count,
> > +                "message-type": message_type,
> > +                "message": message
> > +            })
> > +            count += 1
> > +
> > +    examples.append({"client": clients, "server": servers})
> > +    return examples, failed
> > +
> > +
> > +def parse_examples_of(self: QAPISchemaGenExamplesVisitor,
> 
> Uh, shouldn't this be a method of QAPISchemaGenExamplesVisitor?

Indeed.
 
> > +                      name: str):
> > +
> > +    assert(name in self.schema._entity_dict)
> 
> Makes both pycodestyle and pylint unhappy.  Better:
> 
>        assert name in self.schema._entity_dict
> 
> pylint warns
> 
>     scripts/qapi/dumpexamples.py:120:19: W0212: Access to a protected member 
> _entity_dict of a client class (protected-access)
> 
> here and two more times below.

Thanks, I'll have all of those fixed.
 
> > +    obj = self.schema._entity_dict[name]
> > +
> > +    if not obj.info.pragma.doc_required:
> > +        return
> > +
> > +    assert((obj.doc is not None))
> 
> Better:
> 
>        assert obj.doc is not None
> 
> > +    module_name = obj._module.name
> > +
> > +    # We initialize random with the name so that we get consistent example
> > +    # ids over different generations. The ids of a given example might
> > +    # change when adding/removing examples, but that's acceptable as the
> > +    # goal is just to grep $id to find what example failed at a given test
> > +    # with minimum chorn over regenerating.
> 
> churn from?

There is one id per example section. The idea of having an id is
that, if a test failed, we can easily find what test failed.

The comment tries to clarify that the goal is the id to be kept
intact (hence, we seed from its name), reducing the churn over
regenerating the output.
 
> > +    random.seed(name, version=2)
> 
> You're reinitializing the global PRNG.  Feels unclean.  Create your own
> here?

I don't see much a problem with it, but sure, feels like local
would be cleaner indeed.
 
> > +
> > +    for s in obj.doc.sections:
> > +        if s.name != "Example":
> 
> docs/devel/qapi-code-gen.rst section "Definition documentation":
> 
>     A tagged section starts with one of the following words:
>     "Note:"/"Notes:", "Since:", "Example"/"Examples", "Returns:", "TODO:".
>     The section ends with the start of a new section.
> 
> You're missing "Examples".

Aha! I see several tests that I'm skipping, thanks! (19 Examples
sections)

> docs/sphinx/qapidoc.py uses s.name.startswith('Example').
> 
> > +            continue
> > +
> > +        if module_name not in self.target:
> > +            self.target[module_name] = []
> > +
> > +        context = f'''{name} at {obj.info.fname}:{obj.info.line}'''
> > +        examples, failed = parse_text_to_dicts(s.text, context)
> > +        if failed:
> > +            # To warn user that docs needs fixing
> > +            self.failed = True
> > +
> > +        for example in examples:
> > +            self.target[module_name].append({
> > +                    "id": get_id(random, 10),
> > +                    "client": example["client"],
> > +                    "server": example["server"]
> > +            })
> > +
> > +
> > +class QAPISchemaGenExamplesVisitor(QAPISchemaVisitor):
> > +
> > +    def __init__(self, prefix: str):
> > +        super().__init__()
> > +        self.target = {}
> 
> @target maps what to what?

Yes, lacking type hint and some documentation.

It is a dictionary that maps "filename" to an array of the
undocumented object. Each object has 3 fields:
    * "id"     : defines unique id for each test
    * "client" : object that wraps message sent by client (->)
    * "server" : object that wraps message sent by server (<-)

About client and server object, they contain 3 fields:
    * "message"      : The actual payload from the examples
                       section
    * "message-type" : Either "command", "return" or "event"
    * "sequence-order: (int) what index it had in the sequence of
                        examples section.

I'll have it properly documented.
 
> > +        self.schema = None
> > +        self.failed = False
> > +
> > +    def visit_begin(self, schema):
> > +        self.schema = schema
> > +
> > +    def visit_end(self):
> > +        self.schema = None
> > +        assert not self.failed, "Should fix the docs"
> 
> Unless I'm misreading the code, this asserts "all the examples parse
> fine."  Misuse of assert for reporting errors.
> 
> > +
> > +    def write(self: QAPISchemaGenExamplesVisitor,
> > +              output_dir: str) -> None:
> > +        for filename, content in self.target.items():
> > +            pathname = os.path.join(output_dir, "examples", filename)
> > +            odir = os.path.dirname(pathname)
> > +            os.makedirs(odir, exist_ok=True)
> > +            result = {"examples": content}
> > +
> > +            with open(pathname, "w") as outfile:
> 
> pylint warns
> 
>     scripts/qapi/dumpexamples.py:180:17: W1514: Using open without explicitly 
> specifying an encoding (unspecified-encoding)
> 
> Recommend to pass encoding='utf=8'.

Ack.

> > +                outfile.write(json.dumps(result, indent=2, sort_keys=True))
> > +
> > +    def visit_command(self: QAPISchemaGenExamplesVisitor,
> > +                      name: str,
> > +                      info: Optional[QAPISourceInfo],
> > +                      ifcond: QAPISchemaIfCond,
> > +                      features: List[QAPISchemaFeature],
> > +                      arg_type: Optional[QAPISchemaObjectType],
> > +                      ret_type: Optional[QAPISchemaType],
> > +                      gen: bool,
> > +                      success_response: bool,
> > +                      boxed: bool,
> > +                      allow_oob: bool,
> > +                      allow_preconfig: bool,
> > +                      coroutine: bool) -> None:
> > +
> > +        if gen:
> > +            parse_examples_of(self, name)
> 
> Why only if gen?

I honestly don't remember. I'll test it out and document it
properly.
 
> > +
> > +    def visit_event(self: QAPISchemaGenExamplesVisitor,
> > +                    name: str,
> > +                    info: Optional[QAPISourceInfo],
> > +                    ifcond: QAPISchemaIfCond,
> > +                    features: List[QAPISchemaFeature],
> > +                    arg_type: Optional[QAPISchemaObjectType],
> > +                    boxed: bool):
> > +
> > +        parse_examples_of(self, name)
> 
> Examples in definition comments for types are silently ignored.  Should
> we do something with them?

The more the merrier. I'll tackle it in the next iteration too.
 
> > diff --git a/scripts/qapi/main.py b/scripts/qapi/main.py
> > index 316736b6a2..9482439fa9 100644
> > --- a/scripts/qapi/main.py
> > +++ b/scripts/qapi/main.py
> > @@ -13,6 +13,7 @@
> >  
> >  from .commands import gen_commands
> >  from .common import must_match
> > +from .dumpexamples import gen_examples
> >  from .error import QAPIError
> >  from .events import gen_events
> >  from .introspect import gen_introspect
> > @@ -53,7 +54,7 @@ def generate(schema_file: str,
> >      gen_commands(schema, output_dir, prefix, gen_tracing)
> >      gen_events(schema, output_dir, prefix)
> >      gen_introspect(schema, output_dir, prefix, unmask)
> > -
> > +    gen_examples(schema, output_dir, prefix)
> >  
> >  def main() -> int:
> >      """
> 
> You provide some type annotations, but mypy isn't happy:
> 
>     $ mypy --config-file scripts/qapi/mypy.ini scripts/qapi/dumpexamples.py 
>     scripts/qapi/parser.py:566: error: Function is missing a return type 
> annotation
>     scripts/qapi/parser.py:570: error: Function is missing a return type 
> annotation
>     scripts/qapi/dumpexamples.py:44: error: Function is missing a type 
> annotation for one or more arguments
>     scripts/qapi/dumpexamples.py:49: error: Function is missing a type 
> annotation for one or more arguments
>     scripts/qapi/dumpexamples.py:49: error: Syntax error in type annotation
>     scripts/qapi/dumpexamples.py:49: note: Suggestion: Use Tuple[T1, ..., Tn] 
> instead of (T1, ..., Tn)
>     scripts/qapi/dumpexamples.py:65: error: Syntax error in type annotation
>     scripts/qapi/dumpexamples.py:65: note: Suggestion: Use Tuple[T1, ..., Tn] 
> instead of (T1, ..., Tn)
>     scripts/qapi/dumpexamples.py:66: error: Need type annotation for 
> "clients" (hint: "clients: List[<type>] = ...")
>     scripts/qapi/dumpexamples.py:66: error: Need type annotation for 
> "servers" (hint: "servers: List[<type>] = ...")
>     scripts/qapi/dumpexamples.py:117: error: Function is missing a return 
> type annotation
>     scripts/qapi/dumpexamples.py:120: error: "None" has no attribute 
> "_entity_dict"
>     scripts/qapi/dumpexamples.py:121: error: "None" has no attribute 
> "_entity_dict"
>     scripts/qapi/dumpexamples.py:161: error: Need type annotation for 
> "target" (hint: "target: Dict[<type>, <type>] = ...")
>     scripts/qapi/dumpexamples.py:165: error: Function is missing a type 
> annotation
>     scripts/qapi/dumpexamples.py:168: error: Function is missing a return 
> type annotation
>     scripts/qapi/dumpexamples.py:168: note: Use "-> None" if function does 
> not return a value
>     scripts/qapi/dumpexamples.py:200: error: Function is missing a return 
> type annotation
>     Found 15 errors in 2 files (checked 1 source file)
> 
> I think before I dig deeper, we should discuss my findings so far.

Yes, I think I agreed with mostly of your suggestions. I'm
learning how to write proper python, so sorry that we saw many
basic lint failures here.
 
> Here's my .pylintrc, in case you want to run pylint yourself:
> 
> disable=
>     consider-using-f-string,
>     fixme,
>     invalid-name,
>     missing-docstring,
>     too-few-public-methods,
>     too-many-arguments,
>     too-many-branches,
>     too-many-instance-attributes,
>     too-many-lines,
>     too-many-locals,
>     too-many-statements,
>     unused-argument,
>     unused-wildcard-import,

Thanks again for the review, I really appreciate it.

Cheers,
Victor

Attachment: signature.asc
Description: PGP signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]