[Qemu-devel] The State of the SaveVM format
From: Juan Quintela
Subject: [Qemu-devel] The State of the SaveVM format
Date: Wed, 09 Sep 2009 10:47:27 +0200
A Sad History of a Doomed Format
--------------------------------
Back in the old days, there was a V1 of the savevm format (nothing more
to tell about it).
Then version 2 appeared. It was a very simple format. The only fields were:
- instance_id
- version_id
- record_len
You can see it at savevm.c::qemu_loadvm_state_v2()
ToDo: Create an image with v2 and see if we can still read it;
otherwise, remove support for loading the v2 format.
And then v3 appeared
commit 9366f4186025e1d8fc3bebd41fb714521c170b6f
Author: aliguori <address@hidden>
Date: Mon Oct 6 14:53:52 2008 +0000
Introduce v3 of savevm protocol
Features:
* Support for progressive save of sections (for live checkpoint/migration)
* An asynchronous API for doing save
* Support for interleaving multiple progressive save sections
(for future support of memory hot-add/storage migration)
* Fully streaming format
* Strong section version checking
At this point, all saving/loading of images was done in plain C, with
functions that did anything they wanted. Life was nice and good
while things worked. When they didn't work, you only knew that they
didn't work. No info at all about why. The Qemu SaveVM format was an
opaque thing that only a correctly configured qemu was able to read.
Fast forward to the present, and VMState appears. What does it do?
It allows you to specify the state as a table; the save function then
walks the table and saves all the fields. The load function
walks the table and loads all the fields. The save and load functions
are obviously always in sync, because both work by walking the same table.
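To make the table idea concrete, here is a minimal sketch in plain C. It
is not the real VMState API: DemoState, DemoField, DEMO_FIELD, demo_save
and demo_load are all hypothetical names made up for this example, and
fields are copied in host byte order (a real format would have to pin
down endianness).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical device state; not a real QEMU device. */
typedef struct {
    uint32_t irq_level;
    uint64_t counter;
    uint8_t  fifo[4];
} DemoState;

/* One table entry: where the field lives and how many bytes it takes. */
typedef struct {
    const char *name;
    size_t offset;
    size_t size;
} DemoField;

#define DEMO_FIELD(type, member) \
    { #member, offsetof(type, member), sizeof(((type *)0)->member) }

/* The single description of the state, shared by save and load. */
static const DemoField demo_fields[] = {
    DEMO_FIELD(DemoState, irq_level),
    DEMO_FIELD(DemoState, counter),
    DEMO_FIELD(DemoState, fifo),
};
#define DEMO_NFIELDS (sizeof(demo_fields) / sizeof(demo_fields[0]))

/* Save: walk the table and append each field. Returns bytes written. */
size_t demo_save(const DemoState *s, uint8_t *buf, size_t cap)
{
    size_t pos = 0;

    for (size_t i = 0; i < DEMO_NFIELDS; i++) {
        const DemoField *f = &demo_fields[i];
        assert(pos + f->size <= cap);   /* caller must size the buffer */
        memcpy(buf + pos, (const uint8_t *)s + f->offset, f->size);
        pos += f->size;
    }
    return pos;
}

/* Load: walk the same table. Returns 0 on success, -1 on short input. */
int demo_load(DemoState *s, const uint8_t *buf, size_t len)
{
    size_t pos = 0;

    for (size_t i = 0; i < DEMO_NFIELDS; i++) {
        const DemoField *f = &demo_fields[i];
        if (pos + f->size > len) {
            return -1;
        }
        memcpy((uint8_t *)s + f->offset, buf + pos, f->size);
        pos += f->size;
    }
    return 0;
}
```

Adding a field means adding one line to demo_fields; there is no second
function to forget to update, which is the whole point of the table.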
And life was good .... Ooops, no, it was not good.
The problem is what to do from here:
- We can have a very simple VMState format that only allows storing:
  - simple types (int32_t, uint64_t, timers, buffers of uint8_t, ...)
  - arrays of valid types
  - structs of valid types
And that is it. Advantage of this approach: it is very simple to
create/test/whatever. Disadvantage: it can't express all the things
that were done in plain C. Everybody agrees that we don't want to
support everything that was done in plain C in the old way. What we
are discussing is "how many" of those things we want to support. Notice
that we can still support _everything_ that we were doing with plain C:
any time you want to do something strange, you just write
your own marshaling functions and you are done. There you can do
anything you want.
This is where we are on how we want to develop the format. People who
have expressed opinions so far:
- Gerd: You do a very simple format, and if the old state can't be
expressed in simple VMState, you just use the old load
function. This keeps VMState clean, and you can load
everything that was done before. Eventually, we remove the
old load state functions once we no longer support such old formats.
- Anthony: If we leave the old load state functions, they will be
around forever. He wants to complicate^Wimprove VMState
to be able to express everything that was done in plain C.
Reason: It is better to only have _one_ set of functions.
- Paul?: I think he said that testing that we can load old state
is impossible, and that it is better to just remove the ability
to load from old versions (I think this was Paul's position, but the
discussion was a month ago, and my memory is not perfect)
I guess that if I am misinterpreting anyone, they will let you know,
don't worry :) As you can see, what we are searching for here is the
least bad solution. All have advantages and disadvantages, and none is
"perfect" or obviously better than the others.
ToDo: Port all devices (for instance, those of a typical PC) to the
current simple VMState and see how many things we are missing (Beware:
Dragons in virtio)
Another day, another problem, this time called: Optional features.
How do we deal with optional features?
- We add feature bits (something like PCI does with optional features;
the exact implementation is not important). When we add an optional
feature to a driver, we just implement save and load as:
  - at save time: if we are using the feature, we set the feature bit
    indicating that we are using it, and we save the state for that
    feature.
  - at load time: if we find a feature that we don't understand, we
    just abort the load.
  - at load time: if a feature that we need is missing, we also abort.
This has a nice advantage: if you load the state from an old qemu, you
don't use the new feature, and you save the state, then you can still
load the state in the old qemu (this is a nice theory; we don't know how
it would work in practice). Another advantage is that you can code
and test each option separately. Michael S. Tsirkin likes this model.
- The other position: Optional features? Such a thing doesn't exist :)
Why? Because if there are no optional features, you always know
from just the version + name of a device whether you support it or not.
With optional features, you have another failure mode: you can find
a feature that you don't understand in the middle of loading the state.
That can't happen if there are no optional features.
But we really, really want optional features (they bring up msix support
again). No problem, you just create _another_ device:
VMStateDescription vmstate_virtio_net = ...
VMStateDescription vmstate_virtio_net_msix =
    VMSTATE_STRUCT(vmstate_virtio_net);
    .... msix bits
You explicitly state which optional features you want to use. Notice
that you can convince qdev to do the right thing:
--device net,model=virtio,msix=on (loads virtio-net-msix)
--device net,model=virtio,msix=off (loads plain virtio-net)
Advantages: you only support the combinations that make sense, you
explicitly state what they are, and VMState continues to be simple.
Why not use optional features? Because then the test matrix explodes
exponentially: each optional feature doubles the number of tests that
you have to do. The disadvantage is that, obviously,
you end up having more devices (although they can be implemented in the
same file and share almost all the code; see how vga-pci and vga-isa
share almost all the code).
Not having optional features has another interesting property:
versions of a device are linear, in the sense that each new version is a
superset of the previous one (i.e. the same fields as the previous one
plus some more). This makes support for loading old versions way
easier. Put Juan (i.e. me) in this camp, and I think that in the past
Gerd liked something like this.
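A small sketch of why linear versions make loading old state easy. This
is plain C, not VMState: DemoDev, DemoVField, demo_dev_save and
demo_dev_load are hypothetical names, every field is a uint32_t to keep
it short, and the "stream" is just an array of words. Each field records
the first version that contains it, and the table is kept in the order
fields were added, so an old stream is a prefix of a new one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_CUR_VERSION 3

/* Hypothetical device; field names are illustrative only. */
typedef struct {
    uint32_t ctrl;      /* present since version 1 */
    uint32_t status;    /* present since version 1 */
    uint32_t irq_mask;  /* added in version 2 */
    uint32_t dma_addr;  /* added in version 3 */
} DemoDev;

typedef struct {
    size_t offset;
    int since_version;  /* first version that contains this field */
} DemoVField;

/* Listed in the order the fields were added (never reordered). */
static const DemoVField demo_vfields[] = {
    { offsetof(DemoDev, ctrl),     1 },
    { offsetof(DemoDev, status),   1 },
    { offsetof(DemoDev, irq_mask), 2 },
    { offsetof(DemoDev, dma_addr), 3 },
};
#define DEMO_NVFIELDS (sizeof(demo_vfields) / sizeof(demo_vfields[0]))

/* Save always writes the current version: every field, in table order. */
size_t demo_dev_save(const DemoDev *d, uint32_t *out, size_t cap)
{
    assert(cap >= DEMO_NVFIELDS);
    for (size_t i = 0; i < DEMO_NVFIELDS; i++) {
        memcpy(&out[i], (const uint8_t *)d + demo_vfields[i].offset,
               sizeof(uint32_t));
    }
    return DEMO_NVFIELDS;
}

/* Load a stream written at version_id. Fields newer than that version
 * are left untouched (i.e. at whatever defaults the caller set).
 * Returns 0 on success, -1 on unknown version or short input. */
int demo_dev_load(DemoDev *d, int version_id, const uint32_t *in, size_t n)
{
    size_t pos = 0;

    if (version_id < 1 || version_id > DEMO_CUR_VERSION) {
        return -1;
    }
    for (size_t i = 0; i < DEMO_NVFIELDS; i++) {
        if (demo_vfields[i].since_version > version_id) {
            break;  /* table is ordered, so nothing later applies */
        }
        if (pos >= n) {
            return -1;
        }
        memcpy((uint8_t *)d + demo_vfields[i].offset, &in[pos++],
               sizeof(uint32_t));
    }
    return 0;
}
```

Knowing only the device name and version_id is enough to decide, before
reading a single field, whether the load can succeed.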
To help make a decision here, it is a good idea to look at all the
devices and see whether, when they add more fields for a new version,
they typically:
- add optional features
- add them because now the simulation is better/whatever (they are
_not_ optional)
Notice that, again, both approaches have advantages and disadvantages;
it just depends on what your priorities are :)
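For comparison, the feature-bit position from the first option could be
sketched like this. Everything here is an assumption made for the
example: the DemoNic device, the DEMO_FEAT_MSIX bit, the stream layout
(word 0 holds the feature bits, word 1 the base state, then one word per
feature in use), and the `supported`/`required` parameters that stand in
for "what this qemu understands" and "what this configuration needs".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_FEAT_MSIX (1u << 0)   /* hypothetical feature bit */

/* Hypothetical NIC with one optional feature. */
typedef struct {
    uint32_t base_reg;
    int      msix_in_use;
    uint32_t msix_vector;
} DemoNic;

/* Save: set the bit only if the feature is in use, then save its state.
 * Returns the number of words written. */
size_t demo_nic_save(const DemoNic *n, uint32_t *out, size_t cap)
{
    uint32_t feats = n->msix_in_use ? DEMO_FEAT_MSIX : 0;
    size_t pos = 0;

    assert(cap >= 3);
    out[pos++] = feats;
    out[pos++] = n->base_reg;
    if (feats & DEMO_FEAT_MSIX) {
        out[pos++] = n->msix_vector;
    }
    return pos;
}

/* Load: abort on a feature we don't understand, and abort if a feature
 * we need is missing. Returns 0 on success, -1 on abort. */
int demo_nic_load(DemoNic *n, uint32_t supported, uint32_t required,
                  const uint32_t *in, size_t len)
{
    size_t pos = 0;
    uint32_t feats;

    if (len < 2) {
        return -1;
    }
    feats = in[pos++];
    if (feats & ~supported) {
        return -1;              /* unknown feature in the stream */
    }
    if (required & ~feats) {
        return -1;              /* needed feature not in the stream */
    }
    n->base_reg = in[pos++];
    n->msix_in_use = 0;
    if (feats & DEMO_FEAT_MSIX) {
        if (pos >= len) {
            return -1;
        }
        n->msix_vector = in[pos++];
        n->msix_in_use = 1;
    }
    return 0;
}
```

Calling demo_nic_load with supported = 0 plays the role of an old qemu:
it loads a stream saved without msix and refuses one saved with it. With
k such bits there are 2^k stream shapes to test, which is the test
matrix explosion described above.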
More problems: Going from newer versions to old versions
- I think that everybody thinks that this is a nice to have, but that it
will take a lot to make it work, and there are more urgent things to
do.
Notice that there are plans for VMState to do more interesting things
like:
- Be able to show the values in a saved image
- See if a VM is able to load a vmstate (i.e. it has the needed devices
at the needed versions)
- .....
Those are independent of what we decide for the previous problems.
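As an illustration of the "is this VM able to load a vmstate" check, here
is a rough sketch. It assumes the linear-version model (a device at
version N can load anything saved at version <= N) and that a saved
image can be summarized as a list of (device name, version) pairs;
DemoSection and demo_first_unloadable are hypothetical names, not
existing qemu code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One saved section, or one device the target VM has. */
typedef struct {
    const char *name;
    int version_id;
} DemoSection;

/* Returns the index of the first saved section the target cannot load
 * (device missing, or saved version newer than the target's), or -1
 * if every section is loadable. */
int demo_first_unloadable(const DemoSection *saved, size_t nsaved,
                          const DemoSection *have, size_t nhave)
{
    assert(saved != NULL || nsaved == 0);
    for (size_t i = 0; i < nsaved; i++) {
        int ok = 0;

        for (size_t j = 0; j < nhave; j++) {
            if (strcmp(saved[i].name, have[j].name) == 0 &&
                saved[i].version_id <= have[j].version_id) {
                ok = 1;
                break;
            }
        }
        if (!ok) {
            return (int)i;
        }
    }
    return -1;
}
```

Note that this only needs the section headers, not the field contents,
so it could run before any device state is touched.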
Comments? Things that I missed in the discussion?
Later, Juan.