[Qemu-devel] Meeting notes on -blockdev, dynamic backend reconfiguration
From: Markus Armbruster
Subject: [Qemu-devel] Meeting notes on -blockdev, dynamic backend reconfiguration
Date: Mon, 05 Dec 2016 13:03:50 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.5 (gnu/linux)

I recently met Kevin, and we discussed two block layer topics in some
depth.
= -blockdev =
We want a command line option to mirror QMP blockdev-add for 2.9.
QemuOpts has to grow from "list of (key, simple value) plus conventions
to support lists of simple values in limited ways" to the expressive
power of JSON.
== Basic idea ==
QMP pipeline: JSON string -> JSON parser -> QObject -> QObject input
visitor -> QAPI object.  For commands with 'gen': false, we stop at
QObject.  These are rare.
Command line now: option argument string -> QemuOpts parser -> QemuOpts.
We occasionally continue -> options or string input visitor -> QAPI
object.  Neither visitor can handle arbitrary QAPI objects.  Both
visitors extend QemuOpts syntax.
Daniel Berrange posted patches to instead continue -> crumple ->
QObject -> QObject input visitor -> QAPI object.  Arbitrary QObjects
(thus QAPI objects) are possible with the dotted key convention, which
is already used by the block layer.
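To illustrate the dotted key convention, here is a minimal Python
sketch of a "crumple" step.  It is not taken from Daniel's patches; the
function names and the rule that keys 0..n-1 form a list are assumptions
made for illustration.

```python
def crumple(flat):
    """Turn a flat dict with dotted keys into nested dicts and lists.

    Illustrative only: the exact rules of the real crumple step may differ.
    """
    root = {}
    for dotted_key, value in flat.items():
        parts = dotted_key.split(".")
        node = root
        for part in parts[:-1]:
            node = node.setdefault(part, {})
            if not isinstance(node, dict):
                raise ValueError(f"{dotted_key!r} conflicts with a scalar key")
        if parts[-1] in node:
            raise ValueError(f"{dotted_key!r} conflicts with a nested key")
        node[parts[-1]] = value
    return _listify(root)


def _listify(node):
    """Recursively convert dicts keyed by 0..n-1 into lists (assumed rule)."""
    if not isinstance(node, dict):
        return node
    converted = {key: _listify(val) for key, val in node.items()}
    keys = list(converted)
    if keys and all(key.isascii() and key.isdigit() for key in keys):
        by_index = {int(key): val for key, val in converted.items()}
        if sorted(by_index) != list(range(len(keys))):
            raise ValueError("list indexes must be unique and contiguous from 0")
        return [by_index[i] for i in range(len(keys))]
    return converted
```

With this sketch, the keys "file.driver" and "file.filename" end up as
members of one nested "file" object, and "servers.0.host",
"servers.1.host" end up as a two-element list under "servers".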
As before, a visitor sits on top of QemuOpts, providing syntax
extensions. Stacking parsers like that is not a good idea. We want
*one* option argument parser, and we need it to yield a QObject.
== Backward compatibility issues ==
* Traditional key=value,... syntax
* The "repeated key is list" hack
* Options and string input visitor syntax extensions
* Dotted key convention
Hopefully, most of the solutions can be adapted from Daniel's patches.
== Type ambiguity ==
In JSON, the type of a value is syntactically obvious. The JSON parser
yields QObject with these types. The QObject input visitor rejects
values with types that don't match the QAPI schema.
In the traditional key=value command line syntax, the type of a value
isn't obvious. Options and string input visitor convert the string
value to the type expected by the QAPI schema.
Unlike a QObject from JSON, a QObject from QemuOpts has only string
values, and the QObject input visitor needs to be able to convert
instead of reject. Daniel's patches do that.
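The difference between rejecting and converting can be sketched in
Python.  The toy schema format, the function name visit(), the
from_strings flag and the accepted boolean spellings are all assumptions
for illustration; they are not the QAPI schema language or the real
visitor interface.

```python
# Hypothetical boolean spellings, chosen only for this example.
BOOL_WORDS = {"on": True, "off": False}


def visit(value, expected, from_strings=False):
    """Check `value` against a toy schema, optionally converting strings.

    `expected` is 'str', 'int', 'bool', a dict of member schemas, or a
    one-element list holding the element schema.
    """
    if isinstance(expected, dict):
        if not isinstance(value, dict):
            raise TypeError(f"expected an object, got {value!r}")
        unknown = set(value) - set(expected)
        if unknown:
            raise TypeError(f"unexpected members: {sorted(unknown)}")
        return {k: visit(v, expected[k], from_strings) for k, v in value.items()}
    if isinstance(expected, list):
        if not isinstance(value, list):
            raise TypeError(f"expected a list, got {value!r}")
        return [visit(v, expected[0], from_strings) for v in value]
    if expected == "str":
        if isinstance(value, str):
            return value
    elif expected == "int":
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, int) and not isinstance(value, bool):
            return value
        if from_strings and isinstance(value, str):
            return int(value, 0)  # base 0 also accepts a 0x prefix
    elif expected == "bool":
        if isinstance(value, bool):
            return value
        if from_strings and isinstance(value, str) and value in BOOL_WORDS:
            return BOOL_WORDS[value]
    else:
        raise ValueError(f"unknown schema type {expected!r}")
    raise TypeError(f"expected {expected}, got {value!r}")
```

With from_strings left off, the sketch behaves like the JSON path and
rejects "512" where an int is expected.  With from_strings on, it
behaves like the command line path and converts the string.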
== Action item ==
Markus to explore the proposed solution as soon as possible.
= Dynamic block backend reconfiguration =
== Mirror job ==
State before the job:
frontend
|
BB
|
BDS
Frontend writes flow down.
Passive mirror job, as it currently works:
    frontend    mirror-job
       |         |     |
       BB       BB'   BB2
       |    ____/      |
       |   /           |
      BDS             BDS2
The mirror job copies the contents of BDS to BDS2. To handle frontend
writes, BDS maintains a dirty bitmap, which mirror-job uses to copy
updates from BB' to BB2.
Pivot to mirror on job completion: replace BB's child BDS by BDS2,
delete mirror-job and its BB', BB2.
    frontend
       |
       BB
        \_____________
                      \
      BDS             BDS2
Future mirror job using a mirror-filter:
    frontend    mirror-job
       |            |
       BB          BB'
       |          /
      mirror-filter
       |        \
      BDS        BDS2
Passive mirror-filter: maintains dirty bitmap, copies from BDS to BDS2.
Active mirror-filter: no dirty bitmap, mirrors writes to BDS2 directly.
Can easily switch from passive to active at any time.
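A toy Python model may make the two modes concrete.  Images are plain
dicts mapping block numbers to data.  The class and method names are
hypothetical, and draining the pending dirty blocks when switching to
active mode is an assumption of this sketch, not something decided in
the meeting.

```python
class MirrorFilter:
    """Toy mirror-filter over two dict-based images."""

    def __init__(self, source, target, active=False):
        self.source = source
        self.target = target
        self.active = active
        # Everything in the source still needs copying at the start.
        self.dirty = set(source)

    def write(self, block, data):
        """Handle a frontend write arriving from above."""
        self.source[block] = data
        if self.active:
            # Active mode: mirror the write to the target directly.
            self.target[block] = data
            self.dirty.discard(block)
        else:
            # Passive mode: only remember that the block changed.
            self.dirty.add(block)

    def copy_dirty(self):
        """Job side: copy all blocks currently marked dirty."""
        while self.dirty:
            block = self.dirty.pop()
            self.target[block] = self.source[block]

    def go_active(self):
        """Switch from passive to active (assumed: drain dirty blocks after)."""
        self.active = True
        self.copy_dirty()
```

In passive mode the target lags behind until copy_dirty() runs; after
go_active() every write reaches both images at once.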
Pivot: in the parent of mirror-filter, replace the child mirror-filter
by BDS2, then delete mirror-job and its BB'.  We say "parent of" because
other filters may have been inserted: we drop the ones below
mirror-filter, and keep the ones above.
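The pivot with extra filters can be sketched on a toy graph in Python.
The Node class, the helper names and the child names used in the usage
example ("root", "file", "source", "target") are assumptions for
illustration only.

```python
class Node:
    """Toy graph node with named children."""

    def __init__(self, name, **children):
        self.name = name
        self.children = dict(children)


def parents_of(roots, target):
    """Yield (parent, child_name) for every edge that points at target."""
    seen = set()
    stack = list(roots)
    while stack:
        node = stack.pop()
        if id(node) in seen:
            continue
        seen.add(id(node))
        for child_name, child in node.children.items():
            if child is target:
                yield node, child_name
            stack.append(child)


def pivot(roots, filter_node, new_child):
    """Make every parent of filter_node point at new_child instead."""
    # Collect the edges first so the graph is not mutated while walking it.
    for parent, child_name in list(parents_of(roots, filter_node)):
        parent.children[child_name] = new_child
```

Usage, with one filter above and one below mirror-filter:

```python
bds, bds2 = Node("BDS"), Node("BDS2")
below = Node("below-filter", file=bds)
mirror = Node("mirror-filter", source=below, target=bds2)
above = Node("above-filter", file=mirror)
bb = Node("BB", root=above)
pivot([bb], mirror, bds2)
# BB -> above-filter -> BDS2; mirror-filter and below-filter are unreachable.
```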
== Backup job ==
Current backup job:
    frontend    backup-job
       |         |     |
       BB       BB'   BB2
       |    ____/      |
       |   /           |
      BDS             BDS2
The backup job copies the contents of BDS to BDS2.  To handle frontend
writes, BDS provides a before-write-notifier, which backup-job uses to
copy old data from BB' to BB2 right before it's overwritten.
Pivot: delete backup-job and its BB', BB2.
    frontend
       |
       BB
       |
       |
      BDS             BDS2
Future backup job using a backup-filter:
    frontend    backup-job
       |            |
       BB          BB'
       |          /
      backup-filter
       |        \
      BDS        BDS2
backup-filter copies old data from BDS to BDS2 before it forwards the
write to BDS.
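The copy-before-write behavior can be sketched in Python with images as
lists of blocks.  The class and method names are hypothetical.  Tracking
which blocks were already saved, so that a second write to the same
block does not clobber the saved old data, is an assumption this sketch
adds to get a consistent point-in-time copy.

```python
class BackupFilter:
    """Toy backup-filter over two list-based images of equal length."""

    def __init__(self, source, target):
        self.source = source
        self.target = target
        self.copied = set()  # blocks whose old data is already in target

    def _save_old(self, block):
        if block not in self.copied:
            self.target[block] = self.source[block]
            self.copied.add(block)

    def write(self, block, data):
        """Frontend write: save the old data first, then forward the write."""
        self._save_old(block)
        self.source[block] = data

    def job_step(self):
        """Job side: copy one not yet saved block; return False when done."""
        for block in range(len(self.source)):
            if block not in self.copied:
                self._save_old(block)
                return True
        return False
```

However frontend writes and job steps interleave, the target ends up
with the contents the source had when the filter was created.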
Pivot: in the parent of backup-filter, replace the child backup-filter
by BDS, delete backup-job and its BB'.
== Commit job ==
State before the job:
        frontend
           |
           BB
           |
         QCOW2
   file /     \ backing
       /       \
      /         \
    BDS1      QCOW2_top
         file /     \ backing
             /       .
       BDS_top       .
                      \
                       BDS_base
"file" and "backing" are the QCOW2 child names for the delta image and
the backing image, respectively.
Frontend writes flow to BDS1.
Current commit job to commit from QCOW2_top down to BDS_base:
        frontend
           |
           BB                      commit-job
           |                        /      \
         QCOW2                 BB_top    BB_base
   file /     \ backing          /          /
       /       \        ________/          /
      /         \      /                  /
    BDS1      QCOW2_top                  /
         file /     \ backing           /
             /       .                 /
       BDS_top       .           _____/
                      \         /
                       BDS_base
commit-job copies anything allocated above BDS_base up to BDS_top from
BB_top to BB_base.
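As a rough Python model of the copy, treat each image in the backing
chain as a sparse dict from block number to data, with a missing key
meaning "not allocated here, look in the backing image".  The function
names and this representation are assumptions for illustration.

```python
def read_through(chain, block):
    """Read a block via the top image: the topmost allocation wins."""
    for layer in reversed(chain):
        if block in layer:
            return layer[block]
    return None  # unallocated everywhere


def commit(chain):
    """Copy everything allocated above the base down into the base.

    chain[0] is the base image, chain[-1] the top image.
    """
    base, overlays = chain[0], chain[1:]
    allocated = set().union(*overlays)  # blocks allocated above the base
    for block in sorted(allocated):
        base[block] = read_through(chain, block)
```

After commit(), reading any block from the base alone gives the same
data as reading it through the top did before, which is what makes the
pivot safe.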
Pivot: replace backing child of QCOW2_top by BDS_base, delete commit-job
and its BB_top, BB_base.
        frontend
           |
           BB
           |
         QCOW2
   file /     \ backing
       /       \
      /         \
    BDS1      QCOW2_top
         file /     \ backing
             /       \
       BDS_top     BDS_base
Drops any filters meanwhile inserted between QCOW2_top and BDS_base.
Should we have a (otherwise no op) commit-filter node to provide a place
for filters we want to keep? Would op blockers need / profit from such
a filter?
== Streaming job ==
Just like commit (hopefully).
== Basic dynamic reconfiguration operation ==
The basic operation is "replace child".
Beware of race conditions. Consider:
BB
|
mirror-filter
|
BDS
Add a throttle filter under BB while the mirror job is running. First
step, create the filter:
        BB      throttle-filter
        |      /
    mirror-filter
        |
       BDS
Second step, replace child of BB by the new filter:
BB
|
throttle-filter
|
mirror-filter
|
BDS
But: if mirror-filter goes away between the two steps, the replace
brings it right back!
To guard against such races, we need to specify both ends of the edge
being replaced, i.e. parent, child name, actual child. Then the replace
step fails if the mirror-filter has gone away. We can either fail the
whole operation, or start over.
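A guarded "replace child" can be sketched in Python as a
compare-and-replace on one edge.  The Node class, the StaleEdge
exception and the child names "root" and "file" are hypothetical names
used only for this example.

```python
class Node:
    """Toy graph node with named children."""

    def __init__(self, name, **children):
        self.name = name
        self.children = dict(children)


class StaleEdge(Exception):
    """The edge (parent, child name, child) no longer exists as specified."""


def replace_child(parent, child_name, expected_child, new_child):
    """Replace an edge only if it still points at expected_child."""
    if parent.children.get(child_name) is not expected_child:
        raise StaleEdge(f"{parent.name}.{child_name} is not {expected_child.name}")
    parent.children[child_name] = new_child
```

Replaying the race from above: the throttle filter is created on top of
mirror-filter, the mirror job then completes and removes its filter, and
the second step fails instead of bringing mirror-filter back.

```python
bds = Node("BDS")
mirror = Node("mirror-filter", file=bds)
bb = Node("BB", root=mirror)

throttle = Node("throttle-filter", file=mirror)   # first step
replace_child(bb, "root", mirror, bds)            # mirror-filter goes away
try:
    replace_child(bb, "root", mirror, throttle)   # second step: stale edge
except StaleEdge:
    # Start over against the current child of BB.
    current = bb.children["root"]
    throttle.children["file"] = current
    replace_child(bb, "root", current, throttle)
```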
Alternatively, transactions, but that feels much more complex.