[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH v5 13/17] qapi: add qapi2texi script
From: |
Marc-André Lureau |
Subject: |
Re: [Qemu-devel] [PATCH v5 13/17] qapi: add qapi2texi script |
Date: |
Tue, 06 Dec 2016 13:07:24 +0000 |
Hi
On Tue, Dec 6, 2016 at 2:50 PM Markus Armbruster <address@hidden> wrote:
> I had to resort to diff to find your replies, and massage the text
> manually to produce a readable reply myself. Please quote the usual
> way.
>
>
I'd have to switch to something else than gmail (which bothers me for
various reasons, let's not discuss the merits of various mail clients
please ;) In general, I don't have problems, but this mail is rather big,
sorry for the inconvenience..
> Markus Armbruster <address@hidden> writes:
> >
> > > Marc-André Lureau <address@hidden> writes:
> > >
> > >> As the name suggests, the qapi2texi script converts JSON QAPI
> > >> description into a texi file suitable for different target
> > >> formats (info/man/txt/pdf/html...).
> > >>
> > >> It parses the following kind of blocks:
> > >>
> > >> Free-form:
> > >>
> > >> ##
> > >> # = Section
> > >> # == Subsection
> > >> #
> > >> # Some text foo with *emphasis*
> > >> # 1. with a list
> > >> # 2. like that
> > >> #
> > >> # And some code:
> > >> # | $ echo foo
> > >> # | -> do this
> > >> # | <- get that
> > >> #
> > >> ##
> > >>
> > >> Symbol:
> > >>
> > >> ##
> > >> # @symbol:
> > >> #
> > >> # Symbol body ditto ergo sum. Foo bar
> > >> # baz ding.
> > >> #
> > >> # @arg: foo
> > >> # @arg: #optional foo
> > >
> > > Let's not use @arg twice.
> > >
> > > Terminology: I prefer to use "parameter" for formal parameters, and
> > > "argument" for actual arguments. This matches how The Python Language
> > > Reference uses the terms.
> > >
> > > What about
> > >
> > > # @param1: the frob to frobnicate
> > > # @param2: #optional how hard to frobnicate
> >
> > ok
> >
> > >> #
> > >> # Returns: returns bla bla
> > >> # Or bla blah
> > >
> > > Repeating "returns" is awkward, and we don't do that in our schemas.
> > >
> > > We need a period before "Or", or spell it "or".
> > >
> > > What about
> > >
> > > # Returns: the frobnicated frob.
> > > # If frob isn't frobnicatable, GenericError.
> >
> > ok
> >
> > >> #
> > >> # Since: version
> > >> # Notes: notes, comments can have
> > >> # - itemized list
> > >> # - like this
> > >> #
> > >> # Example:
> > >> #
> > >> # -> { "execute": "quit" }
> > >> # <- { "return": {} }
> > >> #
> > >> ##
> > >>
> > >> That's roughly following the following EBNF grammar:
> > >>
> > >> api_comment = "##\n" comment "##\n"
> > >> comment = freeform_comment | symbol_comment
> > >> freeform_comment = { "# " text "\n" | "#\n" }
> > >> symbol_comment = "# @" name ":\n" { member | meta | freeform_comment }
> > >
> > > Rejects non-empty comments where "#" is not followed by space. Such
> > > usage is accepted outside doc comments. Hmm.
> > >
> > >> member = "# @" name ':' [ text ] freeform_comment
> > >
> > > Are you missing a "\n" before freeform_comment?
> >
> > yes
> >
> > >> meta = "# " ( "Returns:", "Since:", "Note:", "Notes:", "Example:",
> "Examples:" ) [ text ] freeform_comment
> > >
> > > Likewise.
> >
> > ok
> >
> > >> text = free-text markdown-like, "#optional" for members
> > >
> > > The grammar is ambiguous: a line "# @foo:\n" can be parsed both as
> > > freeform_comment and as symbol_comment. Since your intent is obvious
> > > enough, it can still serve as documentation. It's not a suitable
> > > foundation for parsing, though. Okay for a commit message.
> > >
> > >> Thanks to the following json expressions, the documentation is
> enhanced
> > >> with extra information about the type of arguments and return value
> > >> expected.
> > >
> > > I guess you want to say that we enrich the documentation we extract
> from
> > > comments with information from the actual schema. Correct?
> >
> > yes
> >
> > > Missing: a brief discussion of deficiencies. These include:
> > >
> > > * The generated QMP documentation includes internal types
> > >
> > > We use qapi-schema.json both for defining the external QMP interface
> > > and for defining internal types. qmp-introspect.py carefully
> > > separates the two, to not expose internal types. qapi2texi.py
> happily
> > > exposes everything.
> > >
> > > Currently, about a fifth of the types in the generated docs are
> > > internal:
> > >
> > > AcpiTableOptions
> > > BiosAtaTranslation
> > > BlockDeviceMapEntry
> > > COLOMessage
> > > COLOMode
> > > DummyForceArrays
> > > FailoverStatus
> > > FloppyDriveType
> > > ImageCheck
> > > LostTickPolicy
> > > MapEntry
> > > MigrationParameter
> > > NetClientDriver
> > > NetFilterDirection
> > > NetLegacy
> > > NetLegacyNicOptions
> > > NetLegacyOptions
> > > NetLegacyOptionsKind
> > > Netdev
> > > NetdevBridgeOptions
> > > NetdevDumpOptions
> > > NetdevHubPortOptions
> > > NetdevL2TPv3Options
> > > NetdevNetmapOptions
> > > NetdevNoneOptions
> > > NetdevSocketOptions
> > > NetdevTapOptions
> > > NetdevUserOptions
> > > NetdevVdeOptions
> > > NetdevVhostUserOptions
> > > NumaNodeOptions
> > > NumaOptions
> > > NumaOptionsKind
> > > OnOffAuto
> > > OnOffSplit
> > > PreallocMode
> > > QCryptoBlockCreateOptions
> > > QCryptoBlockCreateOptionsLUKS
> > > QCryptoBlockFormat
> > > QCryptoBlockInfo
> > > QCryptoBlockInfoBase
> > > QCryptoBlockInfoQCow
> > > QCryptoBlockOpenOptions
> > > QCryptoBlockOptionsBase
> > > QCryptoBlockOptionsLUKS
> > > QCryptoBlockOptionsQCow
> > > QCryptoSecretFormat
> > > QCryptoTLSCredsEndpoint
> > > QapiErrorClass
> > > ReplayMode
> > > X86CPUFeatureWordInfo
> > > X86CPURegister32
> > >
> > > Generating documentation for internal types might be useful, but
> > > letting them pollute QMP interface documentation isn't. Needs fixing
> > > before we release. Until then, needs a FIXME comment in
> qapi2texi.py.
> > >
> > > * Union support is lacking
> > >
> > > The doc string language is adequate for documenting commands, events,
> > > and non-union types. It doesn't really handle union variants.
> Hardly
> > > surprising, as you fitted the language do existing practice, and
> > > existing (mal-)practice is neglecting to document union variant
> > > members.
> > >
> > > * Documentation is lacking
> > >
> > > See review of qapi-code-gen.txt below.
> > >
> > > * Doc comment error message positions are imprecise
> > >
> > > They always point to the beginning of the comment.
> > >
> > > * Probably more
> > >
> > > We should update this with noteworthy findings during review. I
> > > tried, but I suspect I missed some.
> >
> > ok
> >
> > >> Signed-off-by: Marc-André Lureau <address@hidden>
> > > [Lengthy diffstat snipped...]
> > >>
> > >> diff --git a/tests/Makefile.include b/tests/Makefile.include
> > >> index e98d3b6..f16764c 100644
> > >> --- a/tests/Makefile.include
> > >> +++ b/tests/Makefile.include
> > >> @@ -350,6 +350,21 @@ qapi-schema += base-cycle-direct.json
> > >> qapi-schema += base-cycle-indirect.json
> > >> qapi-schema += command-int.json
> > >> qapi-schema += comments.json
> > >> +qapi-schema += doc-bad-args.json
> > >> +qapi-schema += doc-bad-symbol.json
> > >> +qapi-schema += doc-duplicated-arg.json
> > >> +qapi-schema += doc-duplicated-return.json
> > >> +qapi-schema += doc-duplicated-since.json
> > >> +qapi-schema += doc-empty-arg.json
> > >> +qapi-schema += doc-empty-section.json
> > >> +qapi-schema += doc-empty-symbol.json
> > >> +qapi-schema += doc-invalid-end.json
> > >> +qapi-schema += doc-invalid-end2.json
> > >> +qapi-schema += doc-invalid-return.json
> > >> +qapi-schema += doc-invalid-section.json
> > >> +qapi-schema += doc-invalid-start.json
> > >> +qapi-schema += doc-missing-expr.json
> > >> +qapi-schema += doc-missing-space.json
> > >> qapi-schema += double-data.json
> > >> qapi-schema += double-type.json
> > >> qapi-schema += duplicate-key.json
> > >> @@ -443,6 +458,8 @@ qapi-schema += union-optional-branch.json
> > >> qapi-schema += union-unknown.json
> > >> qapi-schema += unknown-escape.json
> > >> qapi-schema += unknown-expr-key.json
> > >> +
> > >> +
> > >> check-qapi-schema-y := $(addprefix tests/qapi-schema/,
> $(qapi-schema))
> > >>
> > >> GENERATED_HEADERS += tests/test-qapi-types.h tests/test-qapi-visit.h
> \
> > >> diff --git a/scripts/qapi.py b/scripts/qapi.py
> > >> index 4d1b0e4..1b456b4 100644
> > >> --- a/scripts/qapi.py
> > >> +++ b/scripts/qapi.py
> > >> @@ -122,6 +122,109 @@ class QAPILineError(Exception):
> > >> "%s:%d: %s" % (self.info['file'], self.info['line'],
> self.msg)
> > >>
> > >>
> > >> +class QAPIDoc(object):
> > >> + class Section(object):
> > >> + def __init__(self, name=""):
> > >
> > > name=None feels more natural than "" for an absent optional name.
> >
> > ok
> >
> > >> + # optional section name (argument/member or section name)
> > >> + self.name = name
> > >> + # the list of strings for this section
> > >
> > > Would "list of lines" be more accurate, or less?
> >
> > more accurate, ok
> >
> > >> + self.content = []
> > >> +
> > >> + def append(self, line):
> > >> + self.content.append(line)
> > >> +
> > >> + def __repr__(self):
> > >> + return "\n".join(self.content).strip()
> > >> +
> > >> + class ArgSection(Section):
> > >> + pass
> > >
> > > Begs the question what makes ArgSection differ from Section. I think
> > > it's the semantics of self.name: an ArgSection's name is the parameter
> > > name, a non-ArgSection can either be a "meta" section (name is one of
> > > "Returns"", "Since", "Note", "Notes", "Example", "Examples") or an
> > > anonymous section (name is "").
> >
> > yes, and the section type is checked in _append_freeform()
> >
> > >> +
> > >> + def __init__(self, parser):
> > >> + self.parser = parser
> > >> + self.symbol = None
> > >> + self.body = QAPIDoc.Section()
> > >> + # a dict {'arg': ArgSection, ...}
> > >
> > > Works. Alternatevely:
> > >
> > > # dict mapping parameter name to ArgSection
> >
> > ok
> >
> > >> + self.args = OrderedDict()
> > >> + # a list of Section
> > >> + self.meta = []
> > >> + # the current section
> > >> + self.section = self.body
> > >> + # associated expression and info (set by expression parser)
> > >
> > > "will be set by expression parser", or "to be set by expression
> parser"?
> >
> > ok
> >
> > >> + self.expr = None
> > >> + self.info = None
> > >
> > > Clearer now. The one remark I still have is on the use of "meta" isn't
> > > obvious here. I think a comment further up explaining the different
> > > kinds of sections would help.
> > >
> > > To be honest, I don't get what's "meta" about the "meta" sections :)
> >
> > Let's drop that distinction, it isn't necessary anymore
> >
> > >> +
> > >> + def has_meta(self, name):
> > >> + """Returns True if the doc has a meta section 'name'"""
> > >
> > > PEP257[1] demands imperative mood, and punctuation:
> > >
> > > """Return True if the doc has a meta section 'name'."""
> > >
> > > Unfortunately, there seems to be no established convention for marking
> > > parameter names. You put this one in single quotes, which works here,
> > > but could be confusing in other cases. I'd sidestep it:
> > >
> > > """Return True if we have a meta section with this name."""
> > >
> > > or
> > >
> > > """Does a meta section with this name exist?"""
> >
> > ok
> >
> > >> + for i in self.meta:
> > >> + if i.name == name:
> > >> + return True
> > >> + return False
> > >> +
> > >> + def append(self, line):
> > >> + """Adds a # comment line, to be parsed and added to current
> section"""
> > >
> > > Imperative mood:
> > >
> > > """Add a comment line, to be parsed and added to the current
> section."""
> > >
> > > However, we're not always adding to the current section, we can also
> > > start a new one. The following avoids suggesting anything about the
> > > current section:
> > >
> > > """Parse a comment line and add it to the documentation."""
> > >
> > > When the function name is a verb, and its doc string starts with a
> > > different verb, the name might be suboptimal. Use your judgement.
> > >
> > > If this one-liner is too terse, we can try a multi-line doc string:
> > >
> > > """Parse a comment line and add it to the documentation.
> > >
> > > TODO should we tell more about how we parse?
> > >
> > > Args:
> > > line: the comment starting with '#', with the newline stripped.
> > >
> > > Raises:
> > > QAPISchemaError: TODO explain error conditions
> > > """
> > >
> > > Format stolen from the Google Python Style Guide[2], because PEP257 is
> > > of not much help there.
> > >
> > > Once you start with such elaborate doc strings, pressure will likely
> > > mount to do the same elsewhere. Could be a distraction right now. Use
> > > your judgement.
> >
> > ok, thanks
> >
> > >> + line = line[1:]
> > >> + if not line:
> > >> + self._append_freeform(line)
> > >> + return
> > >> +
> > >> + if line[0] != ' ':
> > >> + raise QAPISchemaError(self.parser, "missing space after
> #")
> > >> + line = line[1:]
> > >
> > > QAPISchemaError takes the error position from its QAPISchemaParser
> > > argument. In other words, the error position depends on the state of
> > > the parser when it calls this function. Turns out the position is the
> > > beginning of the comment. Servicable here. Action at a distance,
> > > though.
> > >
> > > Less strict:
> > >
> > > # strip leading # and space
> > > line = line[1:]
> > > if line[0] == ' ':
> > > line = line[1:]
> > >
> > > Also avoids special-casing empty comments.
> >
> > That would raise "IndexError: string index out of range" on empty comment
> > lines ("#")
>
> Fixable:
> if line.startswith(' '):
>
> > Also I'd like to keep the error in case a space is missing (it caught
> some
> > already).
>
> On the one hand, I share your distaste for lack of space between # and
> comment text. On the other hand, I dislike enforcing spacing
> inconsistently: enforce in doc comments, but not in other comments.
>
>
It's only enforced for doc comments, which I think is fair.
> > > Perhaps we should treat all leading whitespace the same (I'm not sure):
> > >
> > > # strip leading # and whitespace
> > > line = line[1:]
> > > line = line.lstrip()
> >
> > That would break indentation in Example sections
>
> True; scratch the idea.
>
> > >> +
> > >> + if self.symbol:
> > >> + self._append_symbol_line(line)
> > >> + elif (not self.body.content and
> > >> + line.startswith("@") and line.endswith(":")):
> > >> + self.symbol = line[1:-1]
> > >> + if not self.symbol:
> > >> + raise QAPISchemaError(self.parser, "Invalid symbol")
> > >
> > > Doesn't recognize the symbol when there's anything other than a single
> > > space between # and @. Pathological case: 'address@hidden:' starting in
> column
> > > 6 looks exactly like '# @foo:', but doesn't work. Fortunately, your
> > > code rejects the tab there. Assigning meaning to (leading) whitespace
> > > may make sense, but it must be documented clearly.
> >
> > I think it's simpler to only accept a simple space (# + space + @ +
> symbol
> > + :), it will also lead to more uniform doc.
>
> Simpler is always an argument worth considering, but simpler isn't
> always better. Experience with other languages has taught me to be
> carefule when making the kind or amount of whitespace significant.
>
> Kind of whitespace is not a usability issue with your proposal, because
> your patch rejects anything but space.
>
> Amount of whitespace is, because getting it wrong turns a symbol comment
> block into a free-form comment block, or a parameter section into
> free-form lines belonging to whatever section precedes it.
>
> As long as we reject definitions without a symbol comment, errors of the
> former kind will still be caught, although the error message could be
> confusing.
>
> As is, we don't reject definitions whose symbol comment fails to
> document all parameters. Errors of the latter kind will therefore go
> undetected. Trap for the unwary. See also "@param2 is True" below.
>
> If we can't find a solution without such usability issues now, then I'm
> willing to settle for documentation warning unwary users, or even for a
> FIXME comment.
>
ok
>
> > > Doesn't recognize the symbol when there's crap after ":". Confusing
> > > when it's invisible crap, i.e. trailing whitespace. In general, it's
> > > best to parse left to right, like this:
> > >
> > > elif not self.body.content and line.startswith('@'):
> > > match = valid_name.match(line[1:]);
> > > if not match:
> > > raise QAPISchemaError(self.parser, 'Invalid name')
> > > if line.startswith(':', match.end(0)):
> > > raise QAPISchemaError(self.parser, 'Expected ":")
> >
> > Easier is simply to check :
> > if not line.endswith(":"):
> > raise QAPIParseError(self.parser, "Line should end with
> :")
> >
> > test added
>
> There's a reason we have a mountain of theory on parsing, built in
> decades of practice. What starts "easier" than that generally ends up
> rewritten from scratch. Anyway, let's continue.
>
> > > Here, the error position is clearly suboptimal: it's the beginning of
> > > the comment instead of where the actual error is, namely the beginning
> /
> > > the end of the name. More of the same below. The root cause is
> > > stacking parsers. I'm not asking you to do anything about it at this
> > > time.
> > >
> > > Note my use of "name" rather than "symbol" in the error message.
> > > qapi-code-gen.txt talks about names, not symbols. Consistent use of
> > > terminology matters.
> >
> > Agree, isn't "symbol" more appropriate for documentation though?
>
> Possibly. But if it's an improvement for generated documentation, it
> surely is an improvement for hand-written documentation such as
> qapi-code-gen.txt and the comments in qapi*.py, too. And the Python
> identifiers. I have to admit that cools my enthusiasm for making the
> improvement right now :)
>
> > >> + else:
> > >> + self._append_freeform(line)
> > >> +
> > >> + def _append_symbol_line(self, line):
> > >> + name = line.split(' ', 1)[0]
> > >> +
> > >> + if name.startswith("@") and name.endswith(":"):
> > >> + line = line[len(name)+1:]
> > >> + self._start_args_section(name[1:-1])
> > >
> > > Similar issue. Left to right:
> > >
> > > if re.match(r'\s*@', line):
> > > match = valid_name.match(line[1:]);
> > > if not match:
> > > raise QAPISchemaError(self.parser, 'Invalid
> parameter name')
> > > line = line[match.end(0):]
> > > if line.startswith(':'):
> > > raise QAPISchemaError(self.parser, 'Expected ":")
> > > line = line[1:]
> > > self._start_args_section(match.group(0))
> > >
> > > Even better would be factoring out the common code to parse '@foo:'.
> >
> > It's a valid case to have a section freeform comment starting with @foo,
> ex:
> >
> > @param: do this and than if
> > @param2 is True
>
> Good point, but it's a language issue, not a parsing issue.
>
> The language issue is that we need to recognize a line "# @param: ..."
> as the start of a parameter section, while still permitting free-form
> lines like "# @param2 is True".
>
> This is a pretty convincing argument for requiring the ':'. It can also
> be used as an argument for requiring exactly one space between # and @.
>
> How the language is parsed is completely orthogonal.
>
> > I don't think it's worth to factor out the code to parse @foo vs @foo:,
> yet
> >
> > >> + elif name in ("Returns:", "Since:",
> > >> + # those are often singular or plural
> > >> + "Note:", "Notes:",
> > >> + "Example:", "Examples:"):
> > >> + line = line[len(name)+1:]
> > >> + self._start_meta_section(name[:-1])
> > >
> > > Since we're re.match()ing already, here's how do to it that way:
> >
> > I don't follow your suggestion for the reason above,so I still have the
> > first word 'name'
> >
> > > else
> > > match =
> re.match(r'\s*(Returns|Since|Notes?|Examples?):', line)
> > > if match:
> > > line = line[match.end(0):]
> > > self._start_meta_section(match.group(1))
> >
> > not really much nicer to me (especially because no match yet), I'll keep
> > the current code for now
> >
> > >> +
> > >> + self._append_freeform(line)
> > >> +
> > >> + def _start_args_section(self, name):
> > >> + if not name:
> > >> + raise QAPISchemaError(self.parser, "Invalid argument
> name")
> > >
> > > parameter name
> >
> > ok
> >
> > >> + if name in self.args:
> > >> + raise QAPISchemaError(self.parser, "'%s' arg duplicated"
> % name)
> > >
> > > Duplicate parameter name
> >
> > ok
> >
> > >> + self.section = QAPIDoc.ArgSection(name)
> > >> + self.args[name] = self.section
> > >> +
> > >> + def _start_meta_section(self, name):
> > >> + if name in ("Returns", "Since") and self.has_meta(name):
> > >> + raise QAPISchemaError(self.parser,
> > >> + "Duplicated '%s' section" % name)
> > >> + self.section = QAPIDoc.Section(name)
> > >> + self.meta.append(self.section)
> > >> +
> > >> + def _append_freeform(self, line):
> > >> + in_arg = isinstance(self.section, QAPIDoc.ArgSection)
> > >> + if in_arg and self.section.content and not
> self.section.content[-1] \
> > >> + and line and not line[0].isspace():
> > >
> > > PEP8[3] prefers parenthesises over backslash:
> > >
> > > if (in_arg and self.section.content and not
> self.section.content[-1]
> > > and line and not line[0].isspace()):
> >
> > yes, thanks
> >
> > >> + # an empty line followed by a non-indented
> > >> + # comment ends the argument section
> > >> + self.section = self.body
> > >> + self._append_freeform(line)
> > >> + return
> > >
> > > Switching back to the self.body section like this reorders the
> > > documentation text. I still think this is a terrible idea. A dumb
> > > script is exceedingly unlikely to improve human-written doc comment
> text
> > > by reordering. In the rare case it does, the doc comment source should
> > > be reordered.
> > >
> > > Here's an example where the doc generator happily creates
> unintelligible
> > > garbage if I format CpuDefinitionInfo's doc comment in a slightly off
> > > way:
> > >
> > > ##
> > > # @CpuDefinitionInfo:
> > > #
> > > # Virtual CPU definition.
> > > #
> > > # @name: the name of the CPU definition
> > > #
> > > # @migration-safe: #optional whether a CPU definition can be safely
> > > # used for migration in combination with a QEMU compatibility
> > > # machine when migrating between different QMU versions and between
> > > # hosts with different sets of (hardware or software)
> > > # capabilities.
> > > #
> > > # If not provided, information is not available and callers should
> > > # not assume the CPU definition to be migration-safe. (since 2.8)
> > > #
> > > # @static: whether a CPU definition is static and will not change
> > > # depending on QEMU version, machine type, machine options and
> > > # accelerator options. A static model is always
> > > # migration-safe. (since 2.8)
> > > #
> > > # @unavailable-features: #optional List of properties that prevent
> > > # the CPU model from running in the current host. (since 2.8)
> > > #
> > > # @unavailable-features is a list of QOM property names that
> > > # represent CPU model attributes that prevent the CPU from running.
> > > # If the QOM property is read-only, that means there's no known
> > > # way to make the CPU model run in the current host.
> Implementations
> > > # that choose not to provide specific information return the
> > > # property name "type".
> > > #
> > > # If the property is read-write, it means that it MAY be possible
> > > # to run the CPU model in the current host if that property is
> > > # changed. Management software can use it as hints to suggest or
> > > # choose an alternative for the user, or just to generate
> meaningful
> > > # error messages explaining why the CPU model can't be used.
> > > # If @unavailable-features is an empty list, the CPU model is
> > > # runnable using the current host and machine-type.
> > > # If @unavailable-features is not present, runnability
> > > # information for the CPU is not available.
> > > #
> > > # Since: 1.2.0
> > > ##
> > > { 'struct': 'CpuDefinitionInfo',
> > > 'data': { 'name': 'str', '*migration-safe': 'bool', 'static':
> 'bool',
> > > '*unavailable-features': [ 'str' ] } }
> > >
> > > To detect the problem, you have to read the generated docs attentively.
> > > You know my opinion on our chances for that to happen during
> > > development.
> > >
> > > My point is not that we should support this doc comment format. My
> > > point is that people could conceivably write something like it, and not
> > > get caught in patch review.
> > >
> > > I can see three ways out of this swamp:
> > >
> > > 1. Let sections continue until another one begins.
> >
> > but we have interleaved sections, and no explicit "body" section tag,
> ex:
> >
> > ##
> > # @input-send-event:
> > #
> > # Send input event(s) to guest.
> > #
> > # @device: #optional display device to send event(s) to.
> > # @head: #optional head to send event(s) to, in case the
> > # display device supports multiple scanouts.
> > # @events: List of InputEvent union.
> > #
> > # Returns: Nothing on success.
> > #
> > # The @device and @head parameters can be used to send the input event
> > # to specific input devices in case (a) multiple input devices of the
> > # same kind are added to the virtual machine and (b) you have
> > # configured input routing (see docs/multiseat.txt) for those input
> > # ....
> >
> > > 2. Make text between sections an error.
> > >
> > > 3. When a section ends, start a new anonymous section.
> >
> > It's not clear how to recognize when a section ends and append to comment
> > body.
> >
> > 2. Make text between sections an error.
> >
> > Sound like the best option to me. I'll fix the doc and add a check & test
> > for this common pattern though:
> >
> > ##
> > # @TestStruct:
> > #
> > # body with @var
> > #
> > # @integer: foo
> > # blah
> > #
> > # bao
> > #
> > # Let's catch this bad comment.
> > ##
>
> Go ahead.
>
> > > Can't say offhand which one will work best.
> > >
> > >> + if in_arg or not self.section.name.startswith("Example"):
> > >> + line = line.strip()
> > >
> > > Stripping whitespace is not "Markdown-like", because intendation
> carries
> > > meaning in Markdown. Example:
> > >
> > > * First item in itemized list
> > > * Second item
> > > * Sub-item of second item
> > > * Another sub-item
> > > * Third item
> > >
> > > Stripping whitespace destroys the list structure. If that's what you
> > > want, you get to document where your "Markdown-like" markup is unlike
> > > Markdown :)
> >
> > let's use "comment annotations" instead
>
> Okay.
>
> > > Is there a technical reason for stripping whitespace?
> > >
> > > See also discussion of space after # above.
> > >
> > >> + self.section.append(line)
> > >> +
> > >> +
> > >> class QAPISchemaParser(object):
> > >>
> > >> def __init__(self, fp, previously_included=[], incl_info=None):
> > >> @@ -137,11 +240,18 @@ class QAPISchemaParser(object):
> > >> self.line = 1
> > >> self.line_pos = 0
> > >> self.exprs = []
> > >> + self.docs = []
> > >> self.accept()
> > >>
> > >> while self.tok is not None:
> > >> info = {'file': fname, 'line': self.line,
> > >> 'parent': self.incl_info}
> > >> + if self.tok == '#' and self.val.startswith('##'):
> > >
> > > How can self.val *not* start with '##' here?
> >
> > you are right, unnecessary condition since we break get_expr() only in
> > this case now
> >
> > >> + doc = self.get_doc()
> > >> + doc.info = info
> > >
> > > Let's pass info as argument to get_doc(), so we don't have to dot into
> > > doc here. get_doc() can avoid dotting into doc by passing info to the
> > > constructor.
> >
> > ok
> >
> > >> + self.docs.append(doc)
> > >> + continue
> > >> +
> > >> expr = self.get_expr(False)
> > >> if isinstance(expr, dict) and "include" in expr:
> > >> if len(expr) != 1:
> > >> @@ -160,6 +270,7 @@ class QAPISchemaParser(object):
> > >> raise QAPILineError(info, "Inclusion loop
> for %s"
> > >> % include)
> > >> inf = inf['parent']
> > >> +
> > >> # skip multiple include of the same file
> > >> if incl_abs_fname in previously_included:
> > >> continue
> > >> @@ -171,12 +282,38 @@ class QAPISchemaParser(object):
> > >> exprs_include = QAPISchemaParser(fobj,
> previously_included,
> > >> info)
> > >> self.exprs.extend(exprs_include.exprs)
> > >> + self.docs.extend(exprs_include.docs)
> > >> else:
> > >> expr_elem = {'expr': expr,
> > >> 'info': info}
> > >> + if self.docs and not self.docs[-1].expr:
> > >> + self.docs[-1].expr = expr
> > >> + expr_elem['doc'] = self.docs[-1]
> > >> +
> > >
> > > Attaches the expression to the last doc comment that doesn't already
> > > have one. A bit sloppy, because there could be non-doc comments in
> > > between, or the doc comment could be in another file. It'll do, at
> > > least for now.
> >
> > I extended the condition to check it attaches the doc from the same file
> >
> > >> self.exprs.append(expr_elem)
> > >>
> > >> - def accept(self):
> > >> + def get_doc(self):
> > >> + if self.val != '##':
> > >> + raise QAPISchemaError(self, "Junk after '##' at start of
> "
> > >> + "documentation comment")
> > >> +
> > >> + doc = QAPIDoc(self)
> > >> + self.accept(False)
> > >> + while self.tok == '#':
> > >> + if self.val.startswith('##'):
> > >> + # End of doc comment
> > >> + if self.val != '##':
> > >> + raise QAPISchemaError(self, "Junk after '##' at
> end of "
> > >> + "documentation comment")
> > >> + self.accept()
> > >> + return doc
> > >> + else:
> > >> + doc.append(self.val)
> > >> + self.accept(False)
> > >> +
> > >> + raise QAPISchemaError(self, "Documentation comment must end
> with '##'")
> > >
> > > Let's put this after accept, next to the other get_FOO().
> >
> > ok
> >
> > >> +
> > >> + def accept(self, skip_comment=True):
> > >> while True:
> > >> self.tok = self.src[self.cursor]
> > >> self.pos = self.cursor
> > >> @@ -184,7 +321,13 @@ class QAPISchemaParser(object):
> > >> self.val = None
> > >>
> > >> if self.tok == '#':
> > >> + if self.src[self.cursor] == '#':
> > >> + # Start of doc comment
> > >> + skip_comment = False
> > >> self.cursor = self.src.find('\n', self.cursor)
> > >> + if not skip_comment:
> > >> + self.val = self.src[self.pos:self.cursor]
> > >> + return
> > >> elif self.tok in "{}:,[]":
> > >> return
> > >> elif self.tok == "'":
> > >
> > > Copied from review of v3, so I don't forget:
> > >
> > > Comment tokens are thrown away as before, except when the parser asks
> > > for them by passing skip_comment=False, or when the comment token
> starts
> > > with ##. The parser asks while parsing a doc comment, in get_doc().
> > >
> > > This is a backchannel from the parser to the lexer. I'd rather avoid
> > > such lexer hacks, but I guess we can address that on top.
> >
> > I don't have a good solution to that
>
> I have an idea which may or may not work. I can explore it on top.
>
>
thanks
> > > A comment starting with ## inside an expression is now a syntax error.
> > > For instance, input
> > >
> > > {
> > > ##
> > >
> > > yields
> > >
> > > /dev/stdin:2:1: Expected string or "}"
> > >
> > > Rather unfriendly error message, but we can fix that on top.
> > >
> > >> @@ -713,7 +856,7 @@ def check_keys(expr_elem, meta, required,
> optional=[]):
> > >> % (key, meta, name))
> > >>
> > >>
> > >> -def check_exprs(exprs):
> > >> +def check_exprs(exprs, strict_doc):
> > >
> > > Note: strict_doc=False unless this is qapi2texi.py.
> >
> > For tests reasons: we may want to fix the test instead, or have a flag
> > QAPI_CHECK/NOSTRICT_DOC=1 or an option.
> >
> > >> global all_names
> > >>
> > >> # Learn the types and check for valid expression keys
> > >> @@ -722,6 +865,11 @@ def check_exprs(exprs):
> > >> for expr_elem in exprs:
> > >> expr = expr_elem['expr']
> > >> info = expr_elem['info']
> > >> +
> > >> + if strict_doc and 'doc' not in expr_elem:
> > >> + raise QAPILineError(info,
> > >> + "Expression missing documentation
> comment")
> > >> +
> > >
> > > Why should we supress this error in programs other than qapi2texi.py?
> >
> > because the tests don't have comments
>
> Good point :)
>
> Can we come up with a brief comment explaining this?
>
> I fixed the tests instead (after all, we have complete control on what to
accept or not. that was boring but now it's done)
> > > Can't see a test for this one.
> >
> > because tests aren't strict :)
> >
> > I guess you'll tell me to fix the tests instead, so I did that in the
> next
> > series.
> >
> > >> if 'enum' in expr:
> > >> check_keys(expr_elem, 'enum', ['data'], ['prefix'])
> > >> add_enum(expr['enum'], info, expr['data'])
> > >> @@ -780,6 +928,63 @@ def check_exprs(exprs):
> > >> return exprs
> > >>
> > >>
> > >> +def check_simple_doc(doc):
> > >
> > > You call this "free-form" elsewhere. Pick one name and stick to it.
> > > I think free-form is more descriptive than simple.
> >
> > ok
> >
> > >> + if doc.symbol:
> > >> + raise QAPILineError(doc.info,
> > >> + "'%s' documention is not followed by the
> definition"
> > >> + % doc.symbol)
> > >
> > > "Documentation for %s is not ..."
> >
> > ok
> >
> > >> +
> > >> + body = str(doc.body)
> > >> + if re.search(r'@\S+:', body, re.MULTILINE):
> > >> + raise QAPILineError(doc.info,
> > >> + "Document body cannot contain @NAME:
> sections")
> > >> +
> > >> +
> > >> +def check_expr_doc(doc, expr, info):
> > >
> > > You call this "symbol" elsewhere. I think "definition" would be better
> > > than either.
> >
> > ok
> >
> > >> + for i in ('enum', 'union', 'alternate', 'struct', 'command',
> 'event'):
> > >> + if i in expr:
> > >> + meta = i
> > >> + break
> > >> +
> > >> + name = expr[meta]
> > >> + if doc.symbol != name:
> > >> + raise QAPILineError(info, "Definition of '%s' follows
> documentation"
> > >> + " for '%s'" % (name, doc.symbol))
> > >> + if doc.has_meta('Returns') and 'command' not in expr:
> > >> + raise QAPILineError(info, "Invalid return documentation")
> > >
> > > Suggest something like "'Returns:' is only valid for commands".
> > >
> > > We accept 'Returns:' even when the command doesn't return anything,
> > > because we currently use it to document errors, too. Can't say I like
> > > that, but it's out of scope here.
> >
> > ok
> >
> > >> +
> > >> + doc_args = set(doc.args.keys())
> > >> + if meta == 'union':
> > >> + data = expr.get('base', [])
> > >> + else:
> > >> + data = expr.get('data', [])
> > >> + if isinstance(data, dict):
> > >> + data = data.keys()
> > >> + args = set([name.strip('*') for name in data])
> > >
> > > Not fixed since v3:
> > >
> > > * Flat union where 'base' is a string, e.g. union UserDefFlatUnion in
> > > qapi-schema-test.json has base 'UserDefUnionBase', args is set(['a',
> > > 'B', 'e', 'D', 'f', 'i', 'o', 'n', 's', 'r', 'U'])
> > >
> > > * Command where 'data' is a string, e.g. user_def_cmd0 in
> > > qapi-schema-test.json has data 'Empty2', args is set(['E', 'm', 'p',
> > > '2', 't', 'y'])
> > >
> > > * Event where 'data' is a string, no test case handy (hole in test
> > > coverage)
> >
> > ok, I changed it that way, that fixes it:
> > + if isinstance(data, list):
> > + args = set([name.strip('*') for name in data])
> > + else:
> > + args = set()
>
> Uh, sure this does the right thing when data is a dict?
>
yes, the line above takes care of extracting the keys in data.
>
> > >> + if meta == 'alternate' or \
> > >> + (meta == 'union' and not expr.get('discriminator')):
> > >> + args.add('type')
> > >
> > > As explained in review of v3, this is only a subset of the real set of
> > > members. Computing the exact set is impractical when working with the
> > > abstract syntax tree. I believe we'll eventually have to rewrite this
> > > code to work with the QAPISchemaEntity instead.
> >
> > I don't think we want to list all the members as this would lead to
> > duplicated documentation. Instead it should document only the members of
> > the expr being defined.
>
> You're sticking to established practice, which makes lots of sense.
> However, established practice is a lot more clear for simple cases than
> for things like members inherited from base types, variant members and
> such. At some point, we'll have to figure out how we want not so simple
> cases documented. Until then, we can't really decide how to check
> documentation completeness.
>
> > In which case, it looks like this check is good
> > enough, no?
>
> For now, yes. Later on, maybe.
>
> Let's document the limitation in a comment and the commit message, and
> move on.
>
I'd appreciate your help to list the limitations in the commit message, as
I have not as thorough understanding as you, and even worse I often don't
use the same name for the various concepts.
>
> > >> + if not doc_args.issubset(args):
> > >> + raise QAPILineError(info, "Members documentation is not a
> subset of"
> > >> + " API %r > %r" % (list(doc_args),
> list(args)))
> > >> +
> > >> +
> > >> +def check_docs(docs):
> > >> + for doc in docs:
> > >> + for section in doc.args.values() + doc.meta:
> > >> + content = str(section)
> > >> + if not content or content.isspace():
> > >> + raise QAPILineError(doc.info,
> > >> + "Empty doc section '%s'" %
> section.name)
> > >> +
> > >> + if not doc.expr:
> > >> + check_simple_doc(doc)
> > >> + else:
> > >> + check_expr_doc(doc, doc.expr, doc.info)
> > >> +
> > >> + return docs
> > >> +
> > >> +
> > >> #
> > >> # Schema compiler frontend
> > >> #
> > >> @@ -1249,9 +1454,11 @@ class QAPISchemaEvent(QAPISchemaEntity):
> > >>
> > >>
> > >> class QAPISchema(object):
> > >> - def __init__(self, fname):
> > >> + def __init__(self, fname, strict_doc=False):
> > >> try:
> > >> - self.exprs = check_exprs(QAPISchemaParser(open(fname,
> "r")).exprs)
> > >> + parser = QAPISchemaParser(open(fname, "r"))
> > >> + self.exprs = check_exprs(parser.exprs, strict_doc)
> > >> + self.docs = check_docs(parser.docs)
> > >> self._entity_dict = {}
> > >> self._predefining = True
> > >> self._def_predefineds()
> > >> diff --git a/scripts/qapi2texi.py b/scripts/qapi2texi.py
> > >> new file mode 100755
> > >> index 0000000..0cec43a
> > >> --- /dev/null
> > >> +++ b/scripts/qapi2texi.py
> > >
> > > Still only skimming this one.
> > >
> > >> @@ -0,0 +1,331 @@
> > >> +#!/usr/bin/env python
> > >> +# QAPI texi generator
> > >> +#
> > >> +# This work is licensed under the terms of the GNU LGPL, version 2+.
> > >> +# See the COPYING file in the top-level directory.
> > >> +"""This script produces the documentation of a qapi schema in
> texinfo format"""
> > >> +import re
> > >> +import sys
> > >> +
> > >> +import qapi
> > >> +
> > >> +COMMAND_FMT = """
> > >> address@hidden {type} {{{ret}}} {name} @
> > >> +{{{args}}}
> > >> +
> > >> +{body}
> > >> +
> > >> address@hidden deftypefn
> > >> +
> > >> +""".format
> > >> +
> > >> +ENUM_FMT = """
> > >> address@hidden Enum {name}
> > >> +
> > >> +{body}
> > >> +
> > >> address@hidden deftp
> > >> +
> > >> +""".format
> > >> +
> > >> +STRUCT_FMT = """
> > >> address@hidden {{{type}}} {name} @
> > >> +{{{attrs}}}
> > >> +
> > >> +{body}
> > >> +
> > >> address@hidden deftp
> > >> +
> > >> +""".format
> > >> +
> > >> +EXAMPLE_FMT = """@example
> > >> +{code}
> > >> address@hidden example
> > >> +""".format
> > >> +
> > >> +
> > >> +def subst_strong(doc):
> > >> + """Replaces *foo* by @strong{foo}"""
> > >> + return re.sub(r'\*([^_\n]+)\*', r'@emph{\1}', doc)
> > >> +
> > >> +
> > >> +def subst_emph(doc):
> > >> + """Replaces _foo_ by @emph{foo}"""
> > >> + return re.sub(r'\s_([^_\n]+)_\s', r' @emph{\1} ', doc)
> > >> +
> > >> +
> > >> +def subst_vars(doc):
> > >> + """Replaces @var by @code{var}"""
> > >> + return re.sub(r'@([\w-]+)', r'@code{\1}', doc)
> > >> +
> > >> +
> > >> +def subst_braces(doc):
> > >> + """Replaces {} with @{ @}"""
> > >> + return doc.replace("{", "@{").replace("}", "@}")
> > >> +
> > >> +
> > >> +def texi_example(doc):
> > >> + """Format @example"""
> > >> + doc = subst_braces(doc).strip('\n')
> > >> + return EXAMPLE_FMT(code=doc)
> > >> +
> > >> +
> > >> +def texi_comment(doc):
> > >> + """
> > >> + Format a comment
> > >> +
> > >> + Lines starting with:
> > >> + - |: generates an @example
> > >> + - =: generates @section
> > >> + - ==: generates @subsection
> > >> + - 1. or 1): generates an @enumerate @item
> > >> + - o/*/-: generates an @itemize list
> > >> + """
> > >> + lines = []
> > >> + doc = subst_braces(doc)
> > >> + doc = subst_vars(doc)
> > >> + doc = subst_emph(doc)
> > >> + doc = subst_strong(doc)
> > >> + inlist = ""
> > >> + lastempty = False
> > >> + for line in doc.split('\n'):
> > >> + empty = line == ""
> > >> +
> > >> + if line.startswith("| "):
> > >> + line = EXAMPLE_FMT(code=line[2:])
> > >> + elif line.startswith("= "):
> > >> + line = "@section " + line[2:]
> > >> + elif line.startswith("== "):
> > >> + line = "@subsection " + line[3:]
> > >> + elif re.match("^([0-9]*[.)]) ", line):
> > >> + if not inlist:
> > >> + lines.append("@enumerate")
> > >> + inlist = "enumerate"
> > >> + line = line[line.find(" ")+1:]
> > >> + lines.append("@item")
> > >> + elif re.match("^[o*-] ", line):
> > >> + if not inlist:
> > >> + lines.append("@itemize %s" % {'o': "@bullet",
> > >> + '*': "@minus",
> > >> + '-': ""}[line[0]])
> > >> + inlist = "itemize"
> > >> + lines.append("@item")
> > >> + line = line[2:]
> > >> + elif lastempty and inlist:
> > >> + lines.append("@end %s\n" % inlist)
> > >> + inlist = ""
> > >> +
> > >> + lastempty = empty
> > >> + lines.append(line)
> > >> +
> > >> + if inlist:
> > >> + lines.append("@end %s\n" % inlist)
> > >> + return "\n".join(lines)
> > >> +
> > >> +
> > >> +def texi_args(expr, key="data"):
> > >> + """
> > >> + Format the functions/structure/events.. arguments/members
> > >> + """
> > >> + if key not in expr:
> > >> + return ""
> > >> +
> > >> + args = expr[key]
> > >> + arg_list = []
> > >> + if isinstance(args, str):
> > >> + arg_list.append(args)
> > >> + else:
> > >> + for name, typ in args.iteritems():
> > >> + # optional arg
> > >> + if name.startswith("*"):
> > >> + name = name[1:]
> > >> + arg_list.append("['%s': @var{%s}]" % (name, typ))
> > >> + # regular arg
> > >> + else:
> > >> + arg_list.append("'%s': @var{%s}" % (name, typ))
> > >
> > > Inappropriate use of @var. @var is for metasyntactic variables,
> > > i.e. something that stands for another piece of text. typ isn't, it's
> > > the name of a specific QAPI type. I think you should use @code.
> >
> > yes
> >
> > > This is the reason why the type names in qemu-qmp-ref.txt are often
> > > mangled, e.g.
> > >
> > > -- Struct: VersionInfo { 'qemu': VERSIONTRIPLE, 'package': STR }
> >
> > Right, we have format issue here. If we change it for @code, we get
> > additional quotes in the text format. The simpler is to use no format or
> > @t{}
>
> Pick something you like.
>
> > >> +
> > >> + return ", ".join(arg_list)
> > >> +
> > >> +
> > >> +def texi_body(doc):
> > >> + """
> > >> + Format the body of a symbol documentation:
> > >> + - a table of arguments
> > >> + - followed by "Returns/Notes/Since/Example" sections
> > >> + """
> > >> + def _section_order(section):
> > >> + return {"Returns": 0,
> > >> + "Note": 1,
> > >> + "Notes": 1,
> > >> + "Since": 2,
> > >> + "Example": 3,
> > >> + "Examples": 3}[section]
> > >> +
> > >> + body = "@table @asis\n"
> > >> + for arg, section in doc.args.iteritems():
> > >> + desc = str(section)
> > >> + opt = ''
> > >> + if desc.startswith("#optional"):
> > >> + desc = desc[10:]
> > >> + opt = ' *'
> > >> + elif desc.endswith("#optional"):
> > >> + desc = desc[:-10]
> > >> + opt = ' *'
> > >> + body += "@item @code{'%s'}%s\n%s\n" % (arg, opt,
> texi_comment(desc))
> > >> + body += "@end table\n"
> > >> + body += texi_comment(str(doc.body))
> > >> +
> > >> + meta = sorted(doc.meta, key=lambda i: _section_order(i.name))
> > >> + for section in meta:
> > >> + name, doc = (section.name, str(section))
> > >> + func = texi_comment
> > >> + if name.startswith("Example"):
> > >> + func = texi_example
> > >> +
> > >> + body += "address@hidden address@hidden quotation" % \
> > >> + (name, func(doc))
> > >> + return body
> > >> +
> > >> +
> > >> +def texi_alternate(expr, doc):
> > >> + """
> > >> + Format an alternate to texi
> > >> + """
> > >> + args = texi_args(expr)
> > >> + body = texi_body(doc)
> > >> + return STRUCT_FMT(type="Alternate",
> > >> + name=doc.symbol,
> > >> + attrs="[ " + args + " ]",
> > >> + body=body)
> > >> +
> > >> +
> > >> +def texi_union(expr, doc):
> > >> + """
> > >> + Format an union to texi
> > >
> > > I think it's "a union".
> >
> > ok
> >
> > >> + """
> > >> + attrs = "@{ " + texi_args(expr, "base") + " @}"
> > >> + discriminator = expr.get("discriminator")
> > >> + if not discriminator:
> > >> + union = "Flat Union"
> > >> + discriminator = "type"
> > >> + else:
> > >> + union = "Union"
> > >
> > > Condition is backwards.
> >
> > fixed
> >
> > >> + attrs += " + '%s' = [ " % discriminator
> > >> + attrs += texi_args(expr, "data") + " ]"
> > >> + body = texi_body(doc)
> > >> +
> > >> + return STRUCT_FMT(type=union,
> > >> + name=doc.symbol,
> > >> + attrs=attrs,
> > >> + body=body)
> > >
> > > You're inventing syntax here. Example output:
> > >
> > > -- Union: QCryptoBlockOpenOptions { QCryptoBlockOptionsBase } +
> > > 'format' = [ 'qcow': QCRYPTOBLOCKOPTIONSQCOW, 'luks':
> > > QCRYPTOBLOCKOPTIONSLUKS ]
> > >
> > > The options that are available for all encryption formats when
> > > opening an existing volume
> > > Since: 2.6
> > >
> > > -- Flat Union: ImageInfoSpecific { } + 'type' = [ 'qcow2':
> > > IMAGEINFOSPECIFICQCOW2, 'vmdk': IMAGEINFOSPECIFICVMDK,
> 'luks':
> > > QCRYPTOBLOCKINFOLUKS ]
> > >
> > > A discriminated record of image format specific information
> > > structures.
> > > Since: 1.7
> > >
> > > Note that QCryptoBlockOpenOptions is actually a flat union, and
> > > ImageInfoSpecific a simple union. As I said, the condition is
> > > backwards.
> > >
> > > The meaning of the attrs part is unobvious. Familiarity with schema
> > > syntax doesn't really help.
> > >
> > > Could we simply use schema syntax here?
> >
> > Union: QCryptoBlockOpenOptions {
> > 'base': QCryptoBlockOptionsBase,
> > 'discriminator': 'format',
> > 'data': { 'qcow': QCryptoBlockOptionsQCow, 'luks':
> > QCryptoBlockCreateOptionsLUKS }
> >
> > }
> >
> > Doesn't look obvious either and pollute the documentation with
> > schema-specific parameters.
>
> The schema syntax for flat unions is pretty horrid :) I hope to improve
> it, but there's more urgent fish to fry.
>
> > > If not: whatever format you use, you first need to explain it.
> >
> > I am not sure my solution is the best and will remain, but ok let's try
> to
> > document it for now.
>
> Before you pour time into documenting what you have, we should probably
> discuss what we need.
>
Can this be marked as limitation and improved laster? The series is big,
patch is getting bigger over time, it's quite hard to handle all your
requirements in one go.
> >> +
> > >> +
> > >> +def texi_enum(expr, doc):
> > >> + """
> > >> + Format an enum to texi
> > >> + """
> > >> + for i in expr['data']:
> > >> + if i not in doc.args:
> > >> + doc.args[i] = ''
> > >> + body = texi_body(doc)
> > >> + return ENUM_FMT(name=doc.symbol,
> > >> + body=body)
> > >> +
> > >> +
> > >> +def texi_struct(expr, doc):
> > >> + """
> > >> + Format a struct to texi
> > >> + """
> > >> + args = texi_args(expr)
> > >> + body = texi_body(doc)
> > >> + attrs = "@{ " + args + " @}"
> > >> + base = expr.get("base")
> > >> + if base:
> > >> + attrs += " + %s" % base
> > >> + return STRUCT_FMT(type="Struct",
> > >> + name=doc.symbol,
> > >> + attrs=attrs,
> > >> + body=body)
> > >
> > > More syntax invention. Example output:
> > >
> > > -- Struct: BlockdevOptionsReplication { 'mode': REPLICATIONMODE,
> > > ['top-id': STR] } + BlockdevOptionsGenericFormat
> > >
> > > ''mode''
> > > the replication mode
> > > ''top-id'' *
> > > In secondary mode, node name or device ID of the root node
> who
> > > owns the replication node chain. Must not be given in
> primary
> > > mode.
> > > Driver specific block device options for replication
> > > Since: 2.8
> > >
> > > Meaning of the attrs part is perhaps more guessable here, but it's
> still
> > > guesswork.
> > >
> > > The meaning of * after ''top-id'' is also unobvious.
> > >
> > > Note the redundancy between the attrs part and the body: both state
> > > member names and optionalness. The body doesn't state member types and
> > > base type. If we fixed that, we could drop the attrs part and save us
> > > the trouble of defining and explaining a syntax for it.
> > >
> > > Let me take a step back. This document is about the QMP wire format.
> > > There are no such things as simple and flat unions on the wire, only
> > > JSON objects. QMP introspection duly describes a type's JSON objects,
> > > not how it's defined in QAPI. I think QMP documentation should ideally
> > > do the same.
> > >
> > > QMP introspection uses a common structure for struct, simple and flat
> > > union: common members, variants, and if variants, then the common
> member
> > > that serves as tag. See introspect.json for details.
> > >
> > > Base types are flattened away. Appropriate for introspection, but
> > > documentation shouldn't do that.
> > >
> > > I wrote "ideally" because it's probably too big a step. I'm willing to
> > > settle for something less than ideal.
> >
> > I don't have clear idea what to do here, so if we can leave that for
> later,
> > that would be nice for me (I am already spending more time than I
> imagined
> > I would on doc stuff)
>
> Me too, if that's any consolation...
>
> Perhaps I can find a bit of time to think so I can propose what we could
> do.
>
> > >> +
> > >> +
> > >> +def texi_command(expr, doc):
> > >> + """
> > >> + Format a command to texi
> > >> + """
> > >> + args = texi_args(expr)
> > >> + ret = expr["returns"] if "returns" in expr else ""
> > >> + body = texi_body(doc)
> > >> + return COMMAND_FMT(type="Command",
> > >> + name=doc.symbol,
> > >> + ret=ret,
> > >> + args="(" + args + ")",
> > >> + body=body)
> > >> +
> > >> +
> > >> +def texi_event(expr, doc):
> > >> + """
> > >> + Format an event to texi
> > >> + """
> > >> + args = texi_args(expr)
> > >> + body = texi_body(doc)
> > >> + return COMMAND_FMT(type="Event",
> > >> + name=doc.symbol,
> > >> + ret="",
> > >> + args="(" + args + ")",
> > >> + body=body)
> > >> +
> > >> +
> > >> +def texi_expr(expr, doc):
> > >> + """
> > >> + Format an expr to texi
> > >> + """
> > >> + (kind, _) = expr.items()[0]
> > >> +
> > >> + fmt = {"command": texi_command,
> > >> + "struct": texi_struct,
> > >> + "enum": texi_enum,
> > >> + "union": texi_union,
> > >> + "alternate": texi_alternate,
> > >> + "event": texi_event}
> > >> + try:
> > >> + fmt = fmt[kind]
> > >> + except KeyError:
> > >> + raise ValueError("Unknown expression kind '%s'" % kind)
> > >
> > > The try / except converts one kind of error into another. What does
> > > that buy us? As far as I can tell, this shouldn't ever happen.
> >
> > dropped
> >
> > >> +
> > >> + return fmt(expr, doc)
> > >> +
> > >> +
> > >> +def texi(docs):
> > >> + """
> > >> + Convert QAPI schema expressions to texi documentation
> > >> + """
> > >> + res = []
> > >> + for doc in docs:
> > >> + expr = doc.expr
> > >> + if not expr:
> > >> + res.append(texi_body(doc))
> > >> + continue
> > >> + try:
> > >> + doc = texi_expr(expr, doc)
> > >> + res.append(doc)
> > >> + except:
> > >> + print >>sys.stderr, "error at @%s" % doc.info
> > >> + raise
> > >> +
> > >> + return '\n'.join(res)
> > >> +
> > >> +
> > >> +def main(argv):
> > >> + """
> > >> + Takes schema argument, prints result to stdout
> > >> + """
> > >> + if len(argv) != 2:
> > >> + print >>sys.stderr, "%s: need exactly 1 argument: SCHEMA" %
> argv[0]
> > >> + sys.exit(1)
> > >> +
> > >> + schema = qapi.QAPISchema(argv[1], strict_doc=True)
> > >> + print texi(schema.docs)
> > >> +
> > >> +
> > >> +if __name__ == "__main__":
> > >> + main(sys.argv)
> > >> diff --git a/docs/qapi-code-gen.txt b/docs/qapi-code-gen.txt
> > >> index 2841c51..8bc963e 100644
> > >> --- a/docs/qapi-code-gen.txt
> > >> +++ b/docs/qapi-code-gen.txt
> > >> @@ -45,16 +45,13 @@ QAPI parser does not). At present, there is no
> place where a QAPI
> > >> schema requires the use of JSON numbers or null.
> > >>
> > >> Comments are allowed; anything between an unquoted # and the
> following
> > >> -newline is ignored. Although there is not yet a documentation
> > >> -generator, a form of stylized comments has developed for consistently
> > >> -documenting details about an expression and when it was added to the
> > >> -schema. The documentation is delimited between two lines of ##, then
> > >> -the first line names the expression, an optional overview is
> provided,
> > >> -then individual documentation about each member of 'data' is
> provided,
> > >> -and finally, a 'Since: x.y.z' tag lists the release that introduced
> > >> -the expression. Optional members are tagged with the phrase
> > >> -'#optional', often with their default value; and extensions added
> > >> -after the expression was first released are also given a '(since
> > >> +newline is ignored. The documentation is delimited between two lines
> > >> +of ##, then the first line names the expression, an optional overview
> > >> +is provided, then individual documentation about each member of
> 'data'
> > >> +is provided, and finally, a 'Since: x.y.z' tag lists the release that
> > >> +introduced the expression. Optional members are tagged with the
> > >> +phrase '#optional', often with their default value; and extensions
> > >> +added after the expression was first released are also given a
> '(since
> > >> x.y.z)' comment. For example:
> > >>
> > >> ##
> > >> @@ -73,12 +70,49 @@ x.y.z)' comment. For example:
> > >> # (Since 2.0)
> > >> #
> > >> # Since: 0.14.0
> > >> + #
> > >> + # Notes: You can also make a list:
> > >> + # - with items
> > >> + # - like this
> > >> + #
> > >> + # Example:
> > >> + #
> > >> + # -> { "execute": ... }
> > >> + # <- { "return": ... }
> > >> + #
> > >> ##
> > >> { 'struct': 'BlockStats',
> > >> 'data': {'*device': 'str', 'stats': 'BlockDeviceStats',
> > >> '*parent': 'BlockStats',
> > >> '*backing': 'BlockStats'} }
> > >
> > > This example gives an idea of how to document struct types. But the
> > > reader is left guessing how to document other kinds of definitions. In
> > > particular, there's no mention of "Returns", "Note" and "Examples"
> > > sections anywhere in this file. As is, this could do for a tutorial,
> > > but this file is a *reference*, not a tutorial.
> > >
> > > For a reference, we need to be more thorough. A doc comments section
> on
> > > its own seems advisable.
> > >
> > > I guess doc comment examples are best added to the schema code examples
> > > in sections === Struct types ===, ..., === Events ===.
> > >
> > > Careful review for completeness is advised.
> >
> > So far we did quite fine with the generic example. I am not convinced we
> > need to have doc comments for all kinds of types, it looks redundant to
> me.
>
> Well, so far we didn't require people to write well-formed doc comments.
>
> Without such comments in qapi-code-gen.txt's examples, they become
> incomplete. Pity, because I like to feed them to the QAPI generators;
> extracted like this:
>
> $ sed '1,/\$ cat example-schema.json/d;/^[^ ]/,$d;s/^ //'
> ../docs/qapi-code-gen.txt >example-schema.json
>
> > >> +It's also possible to create documentation sections, such as:
> > >> +
> > >> + ##
> > >> + # = Section
> > >> + # == Subsection
> > >> + #
> > >> + # Some text foo with *strong* and _emphasis_
> > >> + # 1. with a list
> > >> + # 2. like that
> > >> + #
> > >> + # And some code:
> > >> + # | $ echo foo
> > >> + # | -> do this
> > >> + # | <- get that
> > >> + #
> > >> + ##
> > >> +
> > >> +Text *foo* and _foo_ are for "strong" and "emphasis" styles (they do
> > >> +not work over multiple lines). @foo is used to reference a symbol.
> > >> +
> > >> +Lines starting with the following characters and a space:
> > >> +- | are examples
> > >> +- = are top section
> > >> +- == are subsection
> > >> +- X. or X) are enumerations (X is any number)
> > >> +- o/*/- are itemized list
> > >> +
> > >
> > > Is this doc markup documentation complete?
> >
> > I think so
> >
> > > Is it related to any existing text markup language? Hmm, the commit
> > > message calls it "Markdown-like". Should we mention that here?
> >
> > I'll use "annotations" instead in the doc
> >
> > > Is all markup valid in all contexts? For instance, is = Section valid
> > > in symbol blocks? What does | (or any markup for that matter) do
> within
> > > an Example section?
> >
> > Example is verbatim<
--
Marc-André Lureau