qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v7 15/21] qapi: add qapi2texi script


From: Markus Armbruster
Subject: Re: [Qemu-devel] [PATCH v7 15/21] qapi: add qapi2texi script
Date: Thu, 12 Jan 2017 14:17:04 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/25.1 (gnu/linux)

Marc-André Lureau <address@hidden> writes:

> Hi
>
> ----- Original Message -----
>> Marc-André Lureau <address@hidden> writes:
>> 
>> > As the name suggests, the qapi2texi script converts JSON QAPI
>> > description into a texi file suitable for different target
>> > formats (info/man/txt/pdf/html...).
>> >
>> > It parses the following kind of blocks:
>> >
>> > Free-form:
>> >
>> >   ##
>> >   # = Section
>> >   # == Subsection
>> >   #
>> >   # Some text foo with *emphasis*
>> >   # 1. with a list
>> >   # 2. like that
>> >   #
>> >   # And some code:
>> >   # | $ echo foo
>> >   # | -> do this
>> >   # | <- get that
>> >   #
>> >   ##
>> >
>> > Symbol description:
>> >
>> >   ##
>> >   # @symbol:
>> >   #
>> >   # Symbol body ditto ergo sum. Foo bar
>> >   # baz ding.
>> >   #
>> >   # @param1: the frob to frobnicate
>> >   # @param2: #optional how hard to frobnicate
>> >   #
>> >   # Returns: the frobnicated frob.
>> >   #          If frob isn't frobnicatable, GenericError.
>> >   #
>> >   # Since: version
>> >   # Notes: notes, comments can have
>> >   #        - itemized list
>> >   #        - like this
>> >   #
>> >   # Example:
>> >   #
>> >   # -> { "execute": "quit" }
>> >   # <- { "return": {} }
>> >   #
>> >   ##
>> >
>> > That's roughly following the following EBNF grammar:
>> >
>> > api_comment = "##\n" comment "##\n"
>> > comment = freeform_comment | symbol_comment
>> > freeform_comment = { "# " text "\n" | "#\n" }
>> > symbol_comment = "# @" name ":\n" { member | tag_section | freeform_comment
>> > }
>> > member = "# @" name ':' [ text ] "\n" freeform_comment
>> > tag_section = "# " ( "Returns:", "Since:", "Note:", "Notes:", "Example:",
>> > "Examples:" ) [ text ]  "\n" freeform_comment
>> > text = free text with markup
>> >
>> > Note that the grammar is ambiguous: a line "# @foo:\n" can be parsed
>> > both as freeform_comment and as symbol_comment.  The actual parser
>> > recognizes symbol_comment.
>> >
>> > See docs/qapi-code-gen.txt for more details.
>> >
>> > Deficiencies:
>> 
>> Perhaps "Deficiencies and limitations".
>
> ok
>
>> 
>> > - the generated QMP documentation includes internal types
>> > - union-type support is lacking
>> 
>> "union type" (no dash).
>> 
>> > - type information is lacking in generated documentation
>> > - doc comment error message positions are imprecise, they point
>> >   to the beginning of the comment.
>> > - see other TODO/FIXME in this commit
>> 
>> Suggest: - a few minor issues, all marked TODO/FIXME in the code
>> 
>
> ok
>
>> >
>> > Signed-off-by: Marc-André Lureau <address@hidden>
>> > ---
>> [diffstat snipped...]
>> > diff --git a/scripts/qapi.py b/scripts/qapi.py
>> > index 3d5f9e1eaf..a92a86f428 100644
>> > --- a/scripts/qapi.py
>> > +++ b/scripts/qapi.py
>> > @@ -125,6 +125,122 @@ class QAPISemError(QAPIError):
>> >                             info['parent'], msg)
>> >  
>> >  
>> > +class QAPIDoc(object):
>> > +    class Section(object):
>> > +        def __init__(self, name=None):
>> > +            # optional section name (argument/member or section name)
>> > +            self.name = name
>> > +            # the list of lines for this section
>> > +            self.content = []
>> > +
>> > +        def append(self, line):
>> > +            self.content.append(line)
>> > +
>> > +        def __repr__(self):
>> > +            return "\n".join(self.content).strip()
>> > +
>> > +    class ArgSection(Section):
>> > +        pass
>> > +
>> > +    def __init__(self, parser, info):
>> > +        # self.parser is used to report errors with QAPIParseError.  The
>> > +        # resulting error position depends on the state of the parser.
>> > +        # It happens to be the beginning of the comment.  More or less
>> > +        # servicable, but action at a distance.
>> > +        self.parser = parser
>> > +        self.info = info
>> > +        self.symbol = None
>> > +        self.body = QAPIDoc.Section()
>> > +        # dict mapping parameter name to ArgSection
>> > +        self.args = OrderedDict()
>> > +        # a list of Section
>> > +        self.sections = []
>> > +        # the current section
>> > +        self.section = self.body
>> > +        # associated expression (to be set by expression parser)
>> > +        self.expr = None
>> > +
>> > +    def has_section(self, name):
>> > +        """Return True if we have a section with this name."""
>> > +        for i in self.sections:
>> > +            if i.name == name:
>> > +                return True
>> > +        return False
>> > +
>> > +    def append(self, line):
>> > +        """Parse a comment line and add it to the documentation."""
>> > +        line = line[1:]
>> > +        if not line:
>> > +            self._append_freeform(line)
>> > +            return
>> > +
>> > +        if line[0] != ' ':
>> > +            raise QAPIParseError(self.parser, "Missing space after #")
>> > +        line = line[1:]
>> > +
>> > +        # FIXME not nice: things like '#  @foo:' and '# @foo: ' aren't
>> > +        # recognized, and get silently treated as ordinary text
>> > +        if self.symbol:
>> > +            self._append_symbol_line(line)
>> > +        elif not self.body.content and line.startswith("@"):
>> > +            if not line.endswith(":"):
>> > +                raise QAPIParseError(self.parser, "Line should end with
>> > :")
>> > +            self.symbol = line[1:-1]
>> > +            # FIXME invalid names other than the empty string aren't
>> > flagged
>> > +            if not self.symbol:
>> > +                raise QAPIParseError(self.parser, "Invalid name")
>> > +        else:
>> > +            self._append_freeform(line)
>> > +
>> > +    def _append_symbol_line(self, line):
>> > +        name = line.split(' ', 1)[0]
>> > +
>> > +        if name.startswith("@") and name.endswith(":"):
>> > +            line = line[len(name)+1:]
>> > +            self._start_args_section(name[1:-1])
>> > +        elif name in ("Returns:", "Since:",
>> > +                      # those are often singular or plural
>> > +                      "Note:", "Notes:",
>> > +                      "Example:", "Examples:",
>> > +                      "TODO:"):
>> > +            line = line[len(name)+1:]
>> > +            self._start_section(name[:-1])
>> > +
>> > +        self._append_freeform(line)
>> > +
>> > +    def _start_args_section(self, name):
>> > +        # FIXME invalid names other than the empty string aren't flagged
>> > +        if not name:
>> > +            raise QAPIParseError(self.parser, "Invalid parameter name")
>> > +        if name in self.args:
>> > +            raise QAPIParseError(self.parser,
>> > +                                 "'%s' parameter name duplicated" % name)
>> > +        if self.sections:
>> > +            raise QAPIParseError(self.parser,
>> > +                                 "'%s' parameter documentation is
>> > interleaved "
>> > +                                 "with other sections" % name)
>> 
>> This error message is rather confusing.  Ideally, we'd point to the the
>> source code that made us add to self.sections, but that's not in the
>> cards right now.  Here's my best try:
>> 
>>                raise QAPIParseError(self.parser,
>>                                     "'@%s:' can't follow '%s' section"
>>                                     % (name, self.sections[0].name))
>> 
>
> works for me too
>
>> > +        self.section = QAPIDoc.ArgSection(name)
>> > +        self.args[name] = self.section
>> > +
>> > +    def _start_section(self, name=""):
>> 
>> Note: @name changed to default to "" since v6 for the benefit of
>> _append_freeform() below.
>> 
>> > +        if name in ("Returns", "Since") and self.has_section(name):
>> > +            raise QAPIParseError(self.parser,
>> > +                                 "Duplicated '%s' section" % name)
>> > +        self.section = QAPIDoc.Section(name)
>> > +        self.sections.append(self.section)
>> > +
>> > +    def _append_freeform(self, line):
>> > +        in_arg = isinstance(self.section, QAPIDoc.ArgSection)
>> > +        if (in_arg and self.section.content and
>> > +                not self.section.content[-1] and
>> > +                line and not line[0].isspace()):
>> 
>> PEP8 recommends breaking lines before operators in new code.
>
> ok
>
>> 
>> > +            self._start_section()
>> 
>> Changed from v6's
>> 
>>                raise QAPIParseError(self.parser, "Invalid section
>>                indentation")
>> 
>> May well be an improvement (I'm certainly no fan of this error myself),
>> but it throws my review off the fast track: now I have to figure out how
>> the new code works.  Can you give me a hint on what this change does?
>
> In v6, we rejected unindented sections after arguments (see 
> tests/qapi-schema/doc-bad-section.json):
>
> # @arg: blah bla
> #       zing bar
> #
> # This should be rejected
>
> And a patch in v6 reordered the json comments. But you asked for the
> doc to not be reordered without careful review (both source and
> output). So those additional paragraphs are added as untitled
> sections, and the doc generator keep section order. The output will be
> in the same order as the source.

Thanks.

>> Or should we take a shortcut and revert to v6 here?  We can always
>> improve on top.
>
> Probably not needed to revert.

I played with it a bit, and it looks like it's working fine.

>> > +        if (in_arg or not self.section.name or
>> > +                not self.section.name.startswith("Example")):
>> 
>> Again, break before the operator.
>
> ok
>
>> 
>> > +            line = line.strip()
>> > +        self.section.append(line)
>> > +
>> > +
>> >  class QAPISchemaParser(object):
>> >  
>> >      def __init__(self, fp, previously_included=[], incl_info=None):
>> > @@ -140,11 +256,17 @@ class QAPISchemaParser(object):
>> >          self.line = 1
>> >          self.line_pos = 0
>> >          self.exprs = []
>> > +        self.docs = []
>> >          self.accept()
>> >  
>> >          while self.tok is not None:
>> >              info = {'file': fname, 'line': self.line,
>> >                      'parent': self.incl_info}
>> > +            if self.tok == '#':
>> > +                doc = self.get_doc(info)
>> > +                self.docs.append(doc)
>> > +                continue
>> > +
>> >              expr = self.get_expr(False)
>> >              if isinstance(expr, dict) and "include" in expr:
>> >                  if len(expr) != 1:
>> > @@ -162,6 +284,7 @@ class QAPISchemaParser(object):
>> >                          raise QAPISemError(info, "Inclusion loop for %s"
>> >                                             % include)
>> >                      inf = inf['parent']
>> > +
>> >                  # skip multiple include of the same file
>> >                  if incl_abs_fname in previously_included:
>> >                      continue
>> > @@ -172,12 +295,19 @@ class QAPISchemaParser(object):
>> >                  exprs_include = QAPISchemaParser(fobj,
>> >                  previously_included,
>> >                                                   info)
>> >                  self.exprs.extend(exprs_include.exprs)
>> > +                self.docs.extend(exprs_include.docs)
>> >              else:
>> >                  expr_elem = {'expr': expr,
>> >                               'info': info}
>> > +                if (self.docs and
>> > +                        self.docs[-1].info['file'] == fname and
>> > +                        not self.docs[-1].expr):
>> 
>> Again, break before the operator.
>
> ok
>
>> 
>> > +                    self.docs[-1].expr = expr
>> > +                    expr_elem['doc'] = self.docs[-1]
>> > +
>> >                  self.exprs.append(expr_elem)
>> >  
>> > -    def accept(self):
>> > +    def accept(self, skip_comment=True):
>> >          while True:
>> >              self.tok = self.src[self.cursor]
>> >              self.pos = self.cursor
>> > @@ -185,7 +315,13 @@ class QAPISchemaParser(object):
>> >              self.val = None
>> >  
>> >              if self.tok == '#':
>> > +                if self.src[self.cursor] == '#':
>> > +                    # Start of doc comment
>> > +                    skip_comment = False
>> >                  self.cursor = self.src.find('\n', self.cursor)
>> > +                if not skip_comment:
>> > +                    self.val = self.src[self.pos:self.cursor]
>> > +                    return
>> >              elif self.tok in "{}:,[]":
>> >                  return
>> >              elif self.tok == "'":
>> 
>> Copied from review of v3, so I don't forget:
>> 
>> Comment tokens are thrown away as before, except when the parser asks
>> for them by passing skip_comment=False, or when the comment token starts
>> with ##.  The parser asks while parsing a doc comment, in get_doc().
>> 
>> This is a backchannel from the parser to the lexer.  I'd rather avoid
>> such lexer hacks, but I guess we can address that on top.
>> 
>> A comment starting with ## inside an expression is now a syntax error.
>> For instance, input
>> 
>>     {
>>     ##
>> 
>> yields
>> 
>>     /dev/stdin:2:1: Expected string or "}"
>> 
>> Rather unfriendly error message, but we can fix that on top.
>> 
>> > @@ -319,6 +455,28 @@ class QAPISchemaParser(object):
>> >              raise QAPIParseError(self, 'Expected "{", "[" or string')
>> >          return expr
>> >  
>> > +    def get_doc(self, info):
>> > +        if self.val != '##':
>> > +            raise QAPIParseError(self, "Junk after '##' at start of "
>> > +                                 "documentation comment")
>> > +
>> > +        doc = QAPIDoc(self, info)
>> > +        self.accept(False)
>> > +        while self.tok == '#':
>> > +            if self.val.startswith('##'):
>> > +                # End of doc comment
>> > +                if self.val != '##':
>> > +                    raise QAPIParseError(self, "Junk after '##' at end of
>> > "
>> > +                                         "documentation comment")
>> > +                self.accept()
>> > +                return doc
>> > +            else:
>> > +                doc.append(self.val)
>> > +            self.accept(False)
>> > +
>> > +        raise QAPIParseError(self, "Documentation comment must end with
>> > '##'")
>> > +
>> > +
>> >  #
>> >  # Semantic analysis of schema expressions
>> >  # TODO fold into QAPISchema
>> > @@ -703,6 +861,11 @@ def check_exprs(exprs):
>> >      for expr_elem in exprs:
>> >          expr = expr_elem['expr']
>> >          info = expr_elem['info']
>> > +
>> > +        if 'doc' not in expr_elem:
>> > +            raise QAPISemError(info,
>> > +                               "Expression missing documentation comment")
>> > +
>> >          if 'enum' in expr:
>> >              check_keys(expr_elem, 'enum', ['data'], ['prefix'])
>> >              add_enum(expr['enum'], info, expr['data'])
>> > @@ -761,6 +924,84 @@ def check_exprs(exprs):
>> >      return exprs
>> >  
>> >  
>> > +def check_freeform_doc(doc):
>> > +    if doc.symbol:
>> > +        raise QAPISemError(doc.info,
>> > +                           "Documention for '%s' is not followed"
>> > +                           " by the definition" % doc.symbol)
>> > +
>> > +    body = str(doc.body)
>> > +    if re.search(r'@\S+:', body, re.MULTILINE):
>> > +        raise QAPISemError(doc.info,
>> > +                           "Free-form documentation block must not
>> > contain"
>> > +                           " @NAME: sections")
>> > +
>> > +
>> > +def check_definition_doc(doc, expr, info):
>> 
>> Lots of churn in this function.  It can certainly use improvement, as
>> pointed out in prior reviews, but it triggers still more review.  Based
>> on the tests, I guess it's for finding documented parameters that don't
>> actually exist, and to diagnose optional mismatch.  Anything else?
>> 
>
> implicitely adds #optional (since it's no longer in type info, it better be 
> in the description), code simplification, message improvement too.
>
> I'd rather keep this too, and improve on top.

Okay.  Review below.

>> Or should we take a shortcut and revert to v6 here?  We can always
>> improve on top.  I think v6 would need a # TODO Should ensure #optional
>> matches the schema.
>> 
>> > +    for i in ('enum', 'union', 'alternate', 'struct', 'command', 'event'):
>> > +        if i in expr:
>> > +            meta = i
>> > +            break
>> > +
>> > +    name = expr[meta]
>> > +    if doc.symbol != name:
>> > +        raise QAPISemError(info, "Definition of '%s' follows
>> > documentation"
>> > +                           " for '%s'" % (name, doc.symbol))
>> > +    if doc.has_section('Returns') and 'command' not in expr:
>> > +        raise QAPISemError(info, "'Returns:' is only valid for commands")
>> 
>> Copied from review of v5, so I don't forget:
>> 
>> We accept 'Returns:' even when the command doesn't return anything,
>> because we currently use it to document errors, too.  Can't say I like
>> that, but it's out of scope here.
>> 
>> > +
>> > +    if meta == 'union':
>> > +        args = expr.get('base', [])

Note: args is either string or dictionary.

>> > +    else:
>> > +        args = expr.get('data', [])

Note: args is either string (only if command or event), dictionary (only
if struct, alternate, command, event) or list (only if enum)

>> > +    if isinstance(args, dict):
>> > +        args = args.keys()

Now args is either string (only if command or event) or list.

>> > +
>> > +    if meta == 'alternate' or \
>> > +       (meta == 'union' and not expr.get('discriminator')):

PEP8 prefers parenthesis over backslash, and recommends breaking lines
before instead of after binary operators:

          if (meta == 'alternate'
              or (meta == 'union' and not expr.get('discriminator'))):

Note: args can only be list here.

>> > +        args.append('type')

Works because we know it's a list.  It's less than obvious, though.

>> > +
>> > +    if isinstance(args, list):
>> > +        for arg in args:
>> > +            if arg[0] == '*':
>> > +                opt = True
>> > +                desc = doc.args.get(arg[1:])
>> > +            else:
>> > +                opt = False
>> > +                desc = doc.args.get(arg)
>> > +            if not desc:
>> > +                continue
>> > +            desc_opt = "#optional" in str(desc)
>> > +            if desc_opt and not opt:
>> > +                raise QAPISemError(info, "Description has #optional, "
>> > +                                   "but the declaration doesn't")
>> > +            if not desc_opt and opt:
>> > +                # silently fix the doc
>> > +                desc.append("#optional")

#optional is redundant information.  But as long as we use it, it better
be correct, to avoid misleading readers of the schema.  Two sides of
correctness: it's there when arg is optional, and it's not there when
it's not.  You enforce the latter, but you silently fix the former.  I
figure you do that because #optional is currently missing in quite a few
places.

Let's add
                      # TODO either fix the schema and make this an error,
                      # or drop #optional entirely

>> > +
>> > +    doc_args = set(doc.args.keys())
>> > +    args = set([name.strip('*') for name in args])

Now args morphs from string or dict to set.  I'd use a new variable.

If args is a string, it becomes a set of characters, which is not what
you want.  For instance, qapi-schema.json's QCryptoBlockOpenOptions has
'base': 'QCryptoBlockOptionsBase', and args changes from
'QCryptoBlockOptionsBase' to set(['O', 'a', 'C', 'B', 'e', 'i', 'c',
'k', 'l', 'o', 'n', 'Q', 'p', 's', 'r', 't', 'y']) here.

>> > +    if not doc_args.issubset(args):
>> > +        raise QAPISemError(info, "The following documented members are 
>> > not in "
>> > +                           "the declaration: %s" % ", ".join(doc_args - 
>> > args))

If args is originally a string, then we happily accept documentation for
its characters.  Quite unlikely to bite in practice, but it's wrong
anyways.

I suspect you could save yourself some grief by returning early when
args is string, say:

    if meta == 'union':
        args = expr.get('base', [])
    else:
        args = expr.get('data', [])
    if isinstance(args, str):
        return
    if isinstance(args, dict):
        args = args.keys()
    assert isinstance(args, list)

    if (meta == 'alternate'
            or (meta == 'union' and not expr.get('discriminator'))):
        args.append('type')

    for arg in args:
        ...

>> Copied from review of v5, so I don't forget:
>> 
>> As explained in review of v3, this is only a subset of the real set of
>> members.  Computing the exact set is impractical when working with the
>> abstract syntax tree.  I believe we'll eventually have to rewrite this
>> code to work with the QAPISchemaEntity instead.
>> 
>> > +
>> > +
>> > +def check_docs(docs):
>> > +    for doc in docs:
>> > +        for section in doc.args.values() + doc.sections:
>> > +            content = str(section)
>> > +            if not content or content.isspace():
>> > +                raise QAPISemError(doc.info,
>> > +                                   "Empty doc section '%s'" %
>> > section.name)
>> > +
>> > +        if not doc.expr:
>> > +            check_freeform_doc(doc)
>> > +        else:
>> > +            check_definition_doc(doc, doc.expr, doc.info)
>> > +
>> > +    return docs
>> > +
>> > +
>> >  #
>> >  # Schema compiler frontend
>> >  #
>> > @@ -1229,7 +1470,9 @@ class QAPISchemaEvent(QAPISchemaEntity):
>> >  class QAPISchema(object):
>> >      def __init__(self, fname):
>> >          try:
>> > -            self.exprs = check_exprs(QAPISchemaParser(open(fname,
>> > "r")).exprs)
>> > +            parser = QAPISchemaParser(open(fname, "r"))
>> > +            self.exprs = check_exprs(parser.exprs)
>> > +            self.docs = check_docs(parser.docs)
>> >              self._entity_dict = {}
>> >              self._predefining = True
>> >              self._def_predefineds()
>> > diff --git a/scripts/qapi2texi.py b/scripts/qapi2texi.py
>> > new file mode 100755
>> > index 0000000000..436749ec7b
>> > --- /dev/null
>> > +++ b/scripts/qapi2texi.py
>> > @@ -0,0 +1,266 @@
>> > +#!/usr/bin/env python
>> > +# QAPI texi generator
>> > +#
>> > +# This work is licensed under the terms of the GNU LGPL, version 2+.
>> > +# See the COPYING file in the top-level directory.
>> > +"""This script produces the documentation of a qapi schema in texinfo
>> > format"""
>> > +import re
>> > +import sys
>> > +
>> > +import qapi
>> > +
>> > +COMMAND_FMT = """
>> > address@hidden {type} {{}} {name}
>> > +
>> > +{body}
>> > +
>> > address@hidden deftypefn
>> > +
>> > +""".format
>> > +
>> > +ENUM_FMT = """
>> > address@hidden Enum {name}
>> > +
>> > +{body}
>> > +
>> > address@hidden deftp
>> > +
>> > +""".format
>> > +
>> > +STRUCT_FMT = """
>> > address@hidden {{{type}}} {name}
>> > +
>> > +{body}
>> > +
>> > address@hidden deftp
>> > +
>> > +""".format
>> > +
>> > +EXAMPLE_FMT = """@example
>> > +{code}
>> > address@hidden example
>> > +""".format
>> > +
>> > +
>> > +def subst_strong(doc):
>> > +    """Replaces *foo* by @strong{foo}"""
>> > +    return re.sub(r'\*([^*\n]+)\*', r'@emph{\1}', doc)
>> > +
>> > +
>> > +def subst_emph(doc):
>> > +    """Replaces _foo_ by @emph{foo}"""
>> > +    return re.sub(r'\b_([^_\n]+)_\b', r' @emph{\1} ', doc)
>> > +
>> > +
>> > +def subst_vars(doc):
>> > +    """Replaces @var by @code{var}"""
>> > +    return re.sub(r'@([\w-]+)', r'@code{\1}', doc)
>> > +
>> > +
>> > +def subst_braces(doc):
>> > +    """Replaces {} with @{ @}"""
>> > +    return doc.replace("{", "@{").replace("}", "@}")
>> > +
>> > +
>> > +def texi_example(doc):
>> > +    """Format @example"""
>> > +    # TODO: Neglects to escape @ characters.
>> > +    # We should probably escape them in subst_braces(), and rename the
>> > +    # function to subst_special() or subs_texi_special().  If we do that,
>> > we
>> > +    # need to delay it until after subst_vars() in texi_format().
>> > +    doc = subst_braces(doc).strip('\n')
>> > +    return EXAMPLE_FMT(code=doc)
>> > +
>> > +
>> > +def texi_format(doc):
>> > +    """
>> > +    Format documentation
>> > +
>> > +    Lines starting with:
>> > +    - |: generates an @example
>> > +    - =: generates @section
>> > +    - ==: generates @subsection
>> > +    - 1. or 1): generates an @enumerate @item
>> > +    - */-: generates an @itemize list
>> > +    """
>> > +    lines = []
>> > +    doc = subst_braces(doc)
>> > +    doc = subst_vars(doc)
>> > +    doc = subst_emph(doc)
>> > +    doc = subst_strong(doc)
>> > +    inlist = ""
>> > +    lastempty = False
>> > +    for line in doc.split('\n'):
>> > +        empty = line == ""
>> > +
>> > +        # FIXME: Doing this in a single if / elif chain is
>> > +        # problematic.  For instance, a line without markup terminates
>> > +        # a list if it follows a blank line (reaches the final elif),
>> > +        # but a line with some *other* markup, such as a = title
>> > +        # doesn't.
>> > +        if line.startswith("| "):
>> > +            line = EXAMPLE_FMT(code=line[2:])
>> > +        elif line.startswith("= "):
>> > +            line = "@section " + line[2:]
>> > +        elif line.startswith("== "):
>> > +            line = "@subsection " + line[3:]
>> > +        elif re.match("^([0-9]*[.]) ", line):
>> 
>> [.] is an unusual way to say \.
>
> Right, forgot to change that when removing ) from v6.
>
>> 
>> Perhaps we should consistently use r"..." for strings used as regular
>> expressions.
>> 
>
> No idea, could be changed on top.

Yes.  But if you respin anyway, adding some r wouldn't hurt.

>> > +            if not inlist:
>> > +                lines.append("@enumerate")
>> > +                inlist = "enumerate"
>> > +            line = line[line.find(" ")+1:]
>> > +            lines.append("@item")
>> > +        elif re.match("^[*-] ", line):
>> > +            if not inlist:
>> > +                lines.append("@itemize %s" % {'*': "@bullet",
>> > +                                              '-': "@minus"}[line[0]])
>> > +                inlist = "itemize"
>> > +            lines.append("@item")
>> > +            line = line[2:]
>> > +        elif lastempty and inlist:
>> > +            lines.append("@end %s\n" % inlist)
>> > +            inlist = ""
>> > +
>> > +        lastempty = empty
>> > +        lines.append(line)
>> > +
>> > +    if inlist:
>> > +        lines.append("@end %s\n" % inlist)
>> > +    return "\n".join(lines)
>> > +
>> > +
>> > +def texi_body(doc):
>> > +    """
>> > +    Format the body of a symbol documentation:
>> > +    - main body
>> > +    - table of arguments
>> > +    - followed by "Returns/Notes/Since/Example" sections
>> > +    """
>> > +    body = texi_format(str(doc.body)) + "\n"
>> > +    if doc.args:
>> > +        body += "@table @asis\n"
>> > +        for arg, section in doc.args.iteritems():
>> > +            desc = str(section)
>> > +            opt = ''
>> > +            if "#optional" in desc:
>> > +                desc = desc.replace("#optional", "")
>> > +                opt = ' (optional)'
>> > +            body += "@item @code{'%s'}%s\n%s\n" % (arg, opt,
>> > +                                                   texi_format(desc))
>> > +        body += "@end table\n"
>> > +
>> > +    for section in doc.sections:
>> > +        name, doc = (section.name, str(section))
>> > +        func = texi_format
>> > +        if name.startswith("Example"):
>> > +            func = texi_example
>> > +
>> > +        if name:
>> 
>> @quotation produces confusing .txt and .html output, as discussed in
>> review of v6.  Either drop it, or record the problem in a comment, e.g.
>> 
>>                # FIXME the indentation produced by @quotation in .txt and
>>                # .html output is confusing
>> 
>
> added

Thanks.

>> > +            body += "address@hidden address@hidden quotation" % \
>> > +                    (name, func(doc))
>> > +        else:
>> > +            body += func(doc)
>> > +
>> > +    return body
>> > +
>> > +
>> > +def texi_alternate(expr, doc):
>> > +    """Format an alternate to texi"""
>> > +    body = texi_body(doc)
>> > +    return STRUCT_FMT(type="Alternate",
>> > +                      name=doc.symbol,
>> > +                      body=body)
>> > +
>> > +
>> > +def texi_union(expr, doc):
>> > +    """Format a union to texi"""
>> > +    discriminator = expr.get("discriminator")
>> > +    if discriminator:
>> > +        union = "Flat Union"
>> > +    else:
>> > +        union = "Simple Union"
>> 
>> Not sure flat vs. simple matters for users of generated documentation,
>> but let's not worry about that now.
>> 
>> > +
>> > +    body = texi_body(doc)
>> > +    return STRUCT_FMT(type=union,
>> > +                      name=doc.symbol,
>> > +                      body=body)
>> > +
>> > +
>> > +def texi_enum(expr, doc):
>> > +    """Format an enum to texi"""
>> > +    for i in expr['data']:
>> > +        if i not in doc.args:
>> > +            doc.args[i] = ''
>> > +    body = texi_body(doc)
>> > +    return ENUM_FMT(name=doc.symbol,
>> > +                    body=body)
>> > +
>> > +
>> > +def texi_struct(expr, doc):
>> > +    """Format a struct to texi"""
>> > +    body = texi_body(doc)
>> > +    return STRUCT_FMT(type="Struct",
>> > +                      name=doc.symbol,
>> > +                      body=body)
>> > +
>> > +
>> > +def texi_command(expr, doc):
>> > +    """Format a command to texi"""
>> > +    body = texi_body(doc)
>> > +    return COMMAND_FMT(type="Command",
>> > +                       name=doc.symbol,
>> > +                       body=body)
>> > +
>> > +
>> > +def texi_event(expr, doc):
>> > +    """Format an event to texi"""
>> > +    body = texi_body(doc)
>> > +    return COMMAND_FMT(type="Event",
>> > +                       name=doc.symbol,
>> > +                       body=body)
>> > +
>> > +
>> > +def texi_expr(expr, doc):
>> > +    """Format an expr to texi"""
>> > +    (kind, _) = expr.items()[0]
>> > +
>> > +    fmt = {"command": texi_command,
>> > +           "struct": texi_struct,
>> > +           "enum": texi_enum,
>> > +           "union": texi_union,
>> > +           "alternate": texi_alternate,
>> > +           "event": texi_event}[kind]
>> > +
>> > +    return fmt(expr, doc)
>> > +
>> > +
>> > +def texi(docs):
>> > +    """Convert QAPI schema expressions to texi documentation"""
>> > +    res = []
>> > +    for doc in docs:
>> > +        expr = doc.expr
>> > +        if not expr:
>> > +            res.append(texi_body(doc))
>> > +            continue
>> > +        try:
>> > +            doc = texi_expr(expr, doc)
>> > +            res.append(doc)
>> > +        except:
>> > +            print >>sys.stderr, "error at @%s" % doc.info
>> > +            raise
>> > +
>> > +    return '\n'.join(res)
>> > +
>> > +
>> > +def main(argv):
>> > +    """Takes schema argument, prints result to stdout"""
>> > +    if len(argv) != 2:
>> > +        print >>sys.stderr, "%s: need exactly 1 argument: SCHEMA" %
>> > argv[0]
>> > +        sys.exit(1)
>> > +
>> > +    schema = qapi.QAPISchema(argv[1])
>> > +    print texi(schema.docs)
>> > +
>> > +
>> > +if __name__ == "__main__":
>> > +    main(sys.argv)
>> > diff --git a/docs/qapi-code-gen.txt b/docs/qapi-code-gen.txt
>> > index 2841c5144a..b29f996fdc 100644
>> > --- a/docs/qapi-code-gen.txt
>> > +++ b/docs/qapi-code-gen.txt
>> > @@ -44,40 +44,148 @@ Input must be ASCII (although QMP supports full
>> > Unicode strings, the
>> >  QAPI parser does not).  At present, there is no place where a QAPI
>> >  schema requires the use of JSON numbers or null.
>> >  
>> > +
>> > +=== Comments ===
>> > +
>> >  Comments are allowed; anything between an unquoted # and the following
>> > -newline is ignored.  Although there is not yet a documentation
>> > -generator, a form of stylized comments has developed for consistently
>> > -documenting details about an expression and when it was added to the
>> > -schema.  The documentation is delimited between two lines of ##, then
>> > -the first line names the expression, an optional overview is provided,
>> > -then individual documentation about each member of 'data' is provided,
>> > -and finally, a 'Since: x.y.z' tag lists the release that introduced
>> > -the expression.  Optional members are tagged with the phrase
>> > -'#optional', often with their default value; and extensions added
>> > -after the expression was first released are also given a '(since
>> > -x.y.z)' comment.  For example:
>> > -
>> > -    ##
>> > -    # @BlockStats:
>> > -    #
>> > -    # Statistics of a virtual block device or a block backing device.
>> > -    #
>> > -    # @device: #optional If the stats are for a virtual block device, the
>> > name
>> > -    #          corresponding to the virtual block device.
>> > -    #
>> > -    # @stats:  A @BlockDeviceStats for the device.
>> > -    #
>> > -    # @parent: #optional This describes the file block device if it has
>> > one.
>> > -    #
>> > -    # @backing: #optional This describes the backing block device if it
>> > has one.
>> > -    #           (Since 2.0)
>> > -    #
>> > -    # Since: 0.14.0
>> > -    ##
>> > -    { 'struct': 'BlockStats',
>> > -      'data': {'*device': 'str', 'stats': 'BlockDeviceStats',
>> > -               '*parent': 'BlockStats',
>> > -               '*backing': 'BlockStats'} }
>> > +newline is ignored.
>> > +
>> > +A multi-line comment that starts and ends with a '##' line is a
>> > +documentation comment.  These are parsed by the documentation
>> > +generator, which recognizes certain markup detailed below.
>> > +
>> > +
>> > +==== Documentation markup ====
>> > +
>> > +Comment text starting with '=' is a section title:
>> > +
>> > +    # = Section title
>> > +
>> > +Double the '=' for a subsection title:
>> > +
>> > +    # == Subection title
>> > +
>> > +'|' denotes examples:
>> > +
>> > +    # | Text of the example, may span
>> > +    # | multiple lines
>> > +
>> > +'*' starts an itemized list:
>> > +
>> > +    # * First item, may span
>> > +    #   multiple lines
>> > +    # * Second item
>> > +
>> > +You can also use '-' instead of '*'.
>> > +
>> > +A decimal number followed by '.' starts a numbered list:
>> > +
>> > +    # 1. First item, may span
>> > +    # multiple lines

Should the second line be indented?

>> > +    # 2. Second item
>> > +
>> > +The actual number doesn't matter.  You could even use '*' instead of
>> > +'2.' for the second item.
>> > +
>> > +FIXME what exactly ends a list
>> 
>> Can you say offhand what ends a list?  If yes, can we resolve this FIXME
>> now rather than later?
>
> well, the code is too permissive, there is a FIXME there already.
>
> In general, I think we want to end a list after a blank line and a "non-item" 
> line. Is the wording ok?

Blank line is pretty restrictive, because list items consisting of
multiple paragraphs aren't possible then.

I think writing something like this is fairly natural:

    * First item, may span
      multiple lines.

      Even multiple paragaphs are okay.

    * Second item

      - This is a sublist
      - Second item of sublist

    * Third item

    This sentence is not part of the list.

We don't support sublists, and I'm not saying we should, I'm just
showing how they would fit in.

My point is that indentation is a natural way to delimit list items.

I don't expect you to implement this now.  All I want for this series is
reasonable documentation.  As far as I can tell, texi_format() ends the
list at a blank line, except when it doesn't (see its FIXME).

What about this: replace "FIXME what exactly ends a list" by

    Lists can't be nested.  Blank lines are currently not supported
    within lists.

and add to texi_format()'s FIXME something like

    Make sure to update section "Documentation markup" in
    docs/qapi-code-gen.txt when fixing this.

Okay?

>> > +
>> > +Additional whitespace between the initial '#' and the comment text is
>> > +permitted.
>> > +
>> > +*foo* and _foo_ are for strong and emphasis styles respectively (they
>> > +do not work over multiple lines). @foo is used to reference a name in
>> > +the schema.
>> > +
>> > +Example:
>> > +
>> > +##
>> > +# = Section
>> > +# == Subsection
>> > +#
>> > +# Some text foo with *strong* and _emphasis_
>> > +# 1. with a list
>> > +# 2. like that
>> > +#
>> > +# And some code:
>> > +# | $ echo foo
>> > +# | -> do this
>> > +# | <- get that
>> > +#
>> > +##
>> > +
>> > +
>> > +==== Expression documentation ====
>> > +
>> > +Each expression that isn't an include directive must be preceded by a
>> > +documentation block.  Such blocks are called expression documentation
>> > +blocks.
>> > +
>> > +The first line of the documentation names the expression, then the
>> > +documentation body is provided, then individual documentation about
>> > +each member of 'data' is provided. Finally, several tagged sections
>> > +can be added.
>> 
>> In my review of v6, I wrote:
>> 
>>     I'm afraid this is more aspiration than specification: the parser
>>     accepts these things in almost any order.
>> 
>>     Could be filed under "deficiencies" for now.
>> 
>> To file under "deficiencies", we need to add a FIXME or TODO either
>> here, or in the parser.  Assuming you haven't tightened the parser
>> meanwhile (hope you haven't; we need to control the churn).
>
> ok
>
>> 
>>     Member of 'data' won't remain accurate.  Consider:
>> 
>>         { 'union': 'GlusterServer',
>>           'base': { 'type': 'GlusterTransport' },
>>           'discriminator': 'type',
>>           'data': { 'unix': 'UnixSocketAddress',
>>                     'tcp': 'InetSocketAddress' } }
>> 
>>     Its doc comment currently doesn't contain argument sections.  It should
>>     eventually contain @type:, @unix: and @tcp:.  Only the latter two are
>>     members of 'data'.
>> 
>>     I should propose something better, but I'm getting perilously close to
>>     the christmas break already.  Later.
>> 
>> Also: passive voice is meh (not your idea; you adapted the existing
>> text).
>> 
>> What about:
>> 
>>    The documentation block consists of a first line naming the
>>    expression, an optional overview, a description of each argument (for
>>    commands and events) or member (for structs, unions and alternates),
>>    and optional tagged sections.
>> 
>> I still don't like the use of "section" for both the = things and the
>> tagged sections, but it's not important enough to spend time on it now.
>> 
>
> ok
>
>> > +
>> > +Optional members are tagged with the phrase '#optional', often with
>> 
>> If we adopt my text, we should say "arguments / members" here.
>> 
>
> ok
>
>> > +their default value; and extensions added after the expression was
>> > +first released are also given a '(since x.y.z)' comment.
>> > +
>> > +A tagged section starts with one of the following words:
>> > +"Note:"/"Notes:", "Since:", "Example"/"Examples", "Returns:", "TODO:".
>> 
>> What ends such a section?  The start of the next one / end of the
>> comment block?
>> 
>
> yes
>
>> I suspect we could avoid PATCH 03 by recognizing "TODO" in addition to
>> "TODO:".  Might also be more robust.  But let's ignore this now.
>> 
>> > +
>> > +A 'Since: x.y.z' tag lists the release that introduced the expression.
>> 
>> Nitpick: 'Since: x.y.z' is a tagged section, not a tag.
>> 
>
> ok
>
>> > +
>> > +For example:
>> > +
>> > +##
>> > +# @BlockStats:
>> > +#
>> > +# Statistics of a virtual block device or a block backing device.
>> > +#
>> > +# @device: #optional If the stats are for a virtual block device, the name
>> > +#          corresponding to the virtual block device.
>> > +#
>> > +# @node-name: #optional The node name of the device. (since 2.3)
>> > +#
>> > +# ... more members ...
>> > +#
>> > +# Since: 0.14.0
>> > +##
>> > +{ 'struct': 'BlockStats',
>> > +  'data': {'*device': 'str', '*node-name': 'str',
>> > +           ... more members ... } }
>> > +
>> > +##
>> > +# @query-blockstats:
>> > +#
>> > +# Query the @BlockStats for all virtual block devices.
>> > +#
>> > +# @query-nodes: #optional If true, the command will query all the
>> > +#               block nodes ... explain, explain ...  (since 2.3)
>> > +#
>> > +# Returns: A list of @BlockStats for each virtual block devices.
>> > +#
>> > +# Since: 0.14.0
>> > +#
>> > +# Example:
>> > +#
>> > +# -> { "execute": "query-blockstats" }
>> > +# <- {
>> > +#      ... lots of output ...
>> > +#    }
>> > +#
>> > +##
>> > +{ 'command': 'query-blockstats',
>> > +  'data': { '*query-nodes': 'bool' },
>> > +  'returns': ['BlockStats'] }
>> > +
>> > +==== Free-form documentation ====
>> > +
>> > +A documentation block that isn't an expression documentation block is
>> > +a free-form documentation block.  These may be used to provide
>> > +additional text and structuring content.
>> > +
>> > +
>> > +=== Schema overview ===
>> >  
>> >  The schema sets up a series of types, as well as commands and events
>> >  that will use those types.  Forward references are allowed: the parser
>> [Tests look okay, snipped...]
>> 



reply via email to

[Prev in Thread] Current Thread [Next in Thread]