[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v14 13/21] qdict: allow qdict_crumple to accept

From: Markus Armbruster
Subject: Re: [Qemu-devel] [PATCH v14 13/21] qdict: allow qdict_crumple to accept compound types as values
Date: Tue, 18 Oct 2016 17:35:58 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.5 (gnu/linux)

Kevin Wolf <address@hidden> writes:

> Am 17.10.2016 um 16:50 hat Markus Armbruster geschrieben:
>> Kevin Wolf <address@hidden> writes:
>> > Am 13.10.2016 um 14:35 hat Markus Armbruster geschrieben:
>> >> Cc: Kevin for discussion of QemuOpts dotted key convention
>> >> 
>> >> "Daniel P. Berrange" <address@hidden> writes:
>> >> 
>> >> > Currently qdict_crumple requires a totally flat QDict as its
>> >> > input. i.e. all values in the QDict must be scalar types.
>> >> >
>> >> > In order to have backwards compatibility with the OptsVisitor,
>> >> > qemu_opt_to_qdict() has a new mode where it may return a QList
>> >> > for values in the QDict, if there was a repeated key. We thus
>> >> > need to allow compound types to appear as values in the input
>> >> > dict given to qdict_crumple().
>> >> >
>> >> > To avoid confusion, we sanity check that the user has not mixed
>> >> > the old and new syntax at the same time. e.g. these are allowed
>> >> >
>> >> >    foo=hello,foo=world,foo=wibble
>> >> >    foo.0=hello,foo.1=world,foo.2=wibble
>> >> >
>> >> > but this is forbidden
>> >> >
>> >> >    foo=hello,foo=world,foo.2=wibble
>> >> 
>> >> I understand the need for foo.bar=val.  It makes it possible to specify
>> >> nested dictionaries with QemuOpts.
>> >> 
>> >> The case for foo.0=val is less clear.  QemuOpts already supports lists,
>> >> by repeating keys.  Why do we need a second, wordier way to specify
>> >> them?
>> >
>> > Probably primarily because someone didn't realise this when introducing
>> > the dotted syntax.
>> Can't even blame "someone" for that; it's an obscure, underdocumented
>> feature of an interface that's collapsing under its load of obscure,
>> underdocumented features.
>> On the other hand, that's not exactly a state that allows for *more*
>> obscure features.
> I don't really think we're introducing more obscure features here, but
> rather trying to implement a single, and rather straightforward, way as
> the standard.
> Dotted syntax for hierarchy has actually plenty of precedence in qemu if
> you look a bit closer (the block layer, -global, -device foo,help, even
> the bus names you mentioned below are really just flattened lists), so
> we're only making things more consistent.
>> >                    Also because flat QDicts can't represent this.
>> Explain?
> Repeated options are parsed into QLists. If you keep it at that without
> flattening you have at least a QDict that contains a QList that contains
> simple types. This is not flat any more.

*All* options are parsed into a (non-QList) list.  That's what is flat.

Only when you start crumpling things go beyond flat, and QDict/QList
come into play.

> Of course, you could argue that flat QDicts are the wrong data structure
> in the first place and instead of flatting everything we should have
> done the equivalent of qdict_crumple from the beginning, but they were
> the natural extension of what was already there and happened to work
> good enough, so this is what we're currently using.

We didn't "flatten everything", because QemuOpts is and has always been
flat to begin with.  Rather we've started to crumple a few things, and
in a separate layer.

I now think this is the wrong approach.  We have clearly outgrown flat
options.  We should redo QemuOpts to support what we need instead of
bolting on an optional crumpling layer.

I guess I reached the place you described, just on a different path :)

>> > But even if I realised that QemuOpts support this syntax, I think we
>> > would still have to use the dotted syntax because it's explicit about
>> > the index and we need that because the list can contains dicts.
>> >
>> > Compare this:
>> >
>> >     driver=quorum,
>> >     child.0.driver=file,child.0.filename=disk1.img,
>> >     child.1.driver=host_device,child.1.filename=/dev/sdb,
>> >     child.2.driver=nbd,child.2.host=localhost
>> >
>> > And this:
>> >
>> >     driver=quorum,
>> >     child.driver=file,child.filename=disk1.img,
>> >     child.driver=host_device,child.filename=/dev/sdb,
>> >     child.driver=nbd,child.host=localhost
>> Aside: both are about equally illegible, to be honest.
> Not sure about equally, but let's agree on "both are illegible".
>> > For starters, can we really trust the order in QemuOpts so that the
>> > right driver and filename are associated with each other?
>> The order is trustworthy, but...
>> > We would also have code to associate the third child.driver with the
>> > first child.host (because file and host_device don't have a host
>> > property). And this isn't even touching optional arguments yet. Would
>> > you really want to implement or review this?
>> ... you're right, doing lists by repeating keys breaks down when
>> combined with the dotted key convention's use of repetition to do
>> structured values.
>> Permit me to digress.
>> QemuOpts wasn't designed for list-values keys.  Doing lists by
>> repetition was clever.
>> QemuOpts wasn't designed for structured values.  Doing structured values
>> by a dotted key convention plus repetition was clever.
>> And there's the problem: too much cleverness, not enough "this is being
>> pushed way beyond its design limits, time to replace it".
>> For me, a replacement should do structured values by providing suitable
>> *value* syntax instead of hacking it into the keys:
>>     { "driver": "quorum",
>>       "child": [ { "driver": "file", "filename": "disk1.img" },
>>                  { "driver": "host_device", "filename=/dev/sdb" },
>>                  { "driver": "nbd", "host": "localhost" } ] }
>> Note how the structure is obvious.  It isn't with dotted keys, not even
>> if you order them sensibly (which users inevitably won't).
>> Also not that the value needs to be parsed by QemuOpts!  You can't leave
>> it to the consumer of the QemuOpts, because if you did, you'd have to
>> escape the commas.
> Okay, so I like JSON. It's a great format for our monitor protocol. We
> even have pretty printers so that it's more or less readable as output
> for human users. However, there is one thing for which it is horrible:
> Getting user input.
> Yes, getting rid of the comma escaping is a first step, but typing JSON
> on the command line remains a PITA. Mostly because of the quotes (which
> you probably have to escape), but also things like the useless outer
> brackets.

As long as you don't need "'" in your JSON, you can simply enclose in
"'" and be done.  Since "'" can only occur in JSON strings, and the same
strings would be present in any other syntax, any other syntax would
be pretty much the same PITA.

>> If you'd rather invent syntax closer to QemuOpts than reuse JSON, you
>> could try
>>     driver=quorum,
>>     child=[{ driver=file, filename=disk1.img },
>>            { driver=host_device, filename=/dev/sdb },
>>            { driver=nbd, host=localhost } ]
> This looks a lot more agreeable as something that I have to type on the
> command line.
> What's more, this is a direct extension of the existing syntax. You
> complained about how the step from simple configurations with -hda to
> more complex ones with -drive completely changes the syntax (and rightly
> so). Going from simple QemuOpts syntax to doing JSON as soon as you get
> structured values and lists would be another similar step. In contrast,
> the new syntax you're suggesting above is a natural extension of what's
> there.

Point taken.  I just don't like inventing syntax, because as a rule, way
too much syntax gets invented, and almost always badly.

>> I'd go with some existing syntax, though.  The one we already use is
>> JSON.
> The one we already use in this context is comma separated key/value
> pairs as parsed by QemuOpts.

Which is flat, and flat doesn't cut it anymore.

If you make it non-flat with dotted key convention, the dotted key
convention becomes part of the syntax.  Counts as inventing syntax in my

>> Your dotted key convention requires two rules: 1. names must not look
>> like integers, and 2. names must not contain '.'.
>> We can avoid rule 2 by requiring '.' to be escaped.  Dan's
>> qdict_crumple() currently does that, to your surprise.  Adding the
>> escaping to existing options is a compatibility break, however.  So, if
>> names with '.' already exist, we can either break compatibility by
>> renaming them, or break it by requiring the '.' to be escaped.
> I don't think we should support any escaping and rather forbid '.'
> completely in names.

I think we should adopt QAPI's naming rules.

>> The names in question are the names declared in QemuOptDesc (easy enough
>> to find) plus any names that the code may use with empty QemuOptDesc
>> (harder).  Empty QemuOptDesc are used with the following option groups:
>> * "drive", qemu_drive_opts in blockdev.c
>>   Also empty_opts in qemu-io.c and qemu_drive_opts in test-replication.c
>> * "numa", qemu_numa_opts in numa.c
>> * "device", qemu_device_opts in qdev-monitor.c
>>   This one pulls in qdev properties.  Properties violating rule 2 exist.
> You posted a list of them in another email. Apart from not being
> user-configurable, I think it actually makes sense to interpret these as
> being structured.
> A lot of them are really array/list properties with a single entry. It
> seems we have some inconsistency there as to whether it the syntax is
> 'list.index' or 'list[index]'.


>                                The former matches the dotted syntax
> rule and is used in external interfaces (e.g. in bus=...), and I seem to
> remember the latter is a newer QOM thing.

Commit 3396590 "qom: Add automatic arrayification to
object_property_add()".  I tried to strangle it in the crib,
Message-Id: <address@hidden>

> Would it be too late to change the latter into the former? In other
> words, is the difference externally visible outside of qom-*, which we
> declared a non-ABI, as far as I know?


>> * "object", qemu_object_opts in vl.c
>>   Also qemu_object_opts in qemu-img.c, qemu-io.c and qemu-nbd.c (WTF?)
>>   This one pulls in properties of user-creatable objects.
>> * "source", qemu_soource_opts (sic!) in qemu-img.c
>> * "reopen", reopen_opts in qemu-io-cmds.c
>> * "file", file_opts in qemu-io.c and qemu-nbd.c
>> * "machine", qemu_machine_opts in vl.c
>>   This one pulls in machine properties.
>> * "tpmdev", qemu_tpmdev_opts in vl.c
>> * "netdev", qemu_netdev_opts in net.c
>> * "net", qemu_net_opts in net.c
>> * "userdef", userdef_opts in tests-opts-visitor.c
>> * "opts_list_03", opts_list_03 in test-qemu-opts.c
>> * "acpi", qemu_acpi_opts in hw/acpi/core.c
>> * "smbios", qemu_smbios_opts in smbios.c
>> Empty QemuOptDesc is used this widely because QemuOpts is too primitive
>> to support the needs of its users.  Unlike a QAPI schema.  Just sayin'.
> I don't think you have to convince anyone that QemuOpts is too limited.
> But today it's still a piece that is needed to get from the command line
> to something superior like QAPI. This is the whole reason why we're
> doing all of this.
> Completely QAPIfication of everything and going directly from the
> command line to QAPI objects would be nice, but somehow I feel it's not
> a realistic hope to get this as a short term solution.

No argument.  However, we should try hard to avoid short term hacks that
make a QAPI-based solution harder.  I believe the hard part will be
retaining all the special cases that got in for convenience, or by
accident, then were repurposed for whatever.  And that's why I'm
resisting additional syntactic/semantic cleverness *especially* when
it's used only sometimes.  Stuff that's used only sometimes can easily
conflict with stuff used at other times.

I don't think we understand the existing cleverness sufficiently to
predict interactions with new cleverness with confidence.

>> Finding the names and assessing how they're impacted by the dotted key
>> convention is going to be a substantial chunk of work.
> Possibly.
> The other option would be to take it step by step and let every user of
> QemuOpts specify on creation whether dots are used for structure or
> should be taken literally. Then you can convert user by user and assess
> them one at a time (which is still a substantial chunk for -object and
> -device, but at least you can concentrate on only these for now).

I'm wary of protracted iterative conversion.  We're better at starting
these than finishing them.  The started-but-not-finished state is
exactly the state that scares me here: still more special cases
basically nobody fully understands.

> If you're concerned about compatibility issues if we find a dot in a
> name only in one of the later users: We can handle it. If you then
> stumble across an obscure "weird.name" property where the dot is really
> part of the name and not already representing structure, you can still
> artificially reinterpret it as a nested structure with a single field
> even if logically that's not what it is... Might be a bit ugly, but
> keeps it working.

Yes, that's how we'd do backward compatibility, if needed.

So, what can we do?  Perhaps something like this:

* Pick a concrete syntax that supports nested stuff.  Proposals on the
  table so far are

  1. Dotted key convention as used in the block layer

  2. Array syntax as used in QOM's automatic arrayification (not a
     complete solution, mentioned because it conflicts with the

  3. JSON

  4. JSON with ':' replaced by '=', double-quotes and the outermost pair
     of braces dropped.

* Replace the QemuOpts internal representation: QDict instead of list.

* Replace the QemuOpts parser.

* Add suitable struct- and list-valued accessors.

  qdict_crumple(qemu_opt_to_qdict(...), true, &errp) gets replaced by a
  trivial "get the whole dictionary" operation.

  Other uses of qemu_opt_to_qdict() need to be rewritten.

* "Repetition builds a list" backward compatibility.  It's used in only
  a few places.

* Possibly "keys with funny characters" backward compatibility.

Feels doable to me.  2.8 would be ambitious, though: soft freeze in in
less than two weeks.  However, getting this series into 2.8 has started
to feel ambitious to me, too.

reply via email to

[Prev in Thread] Current Thread [Next in Thread]