From: Anthony Liguori
Subject: Re: [Qemu-devel] [RFC PATCH 0/4] Fix subsection ambiguity in the migration format
Date: Mon, 25 Jul 2011 18:23:17 -0500
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.17) Gecko/20110516 Lightning/1.0b2 Thunderbird/3.1.10

On 07/25/2011 04:10 PM, Paolo Bonzini wrote:
On Thu, Jun 30, 2011 at 17:46, Paolo Bonzini<address@hidden>  wrote:
With the current migration format, VMS_STRUCTs with subsections
are ambiguous.  The protocol cannot tell whether a 0x5 byte after
the VMS_STRUCT is a subsection or part of the parent data stream.
In the past QEMU assumed it was always a part of a subsection; after
commit eb60260 (savevm: fix corruption in vmstate_subsection_load(),
2011-02-03) the choice depends on whether the VMS_STRUCT has subsections
defined.

Unfortunately, this means that if a destination has no subsections
defined for the struct, it will happily read subsection data into
its own fields.  And if you are "lucky" enough to stumble on a
zero byte at the right time, it will be interpreted as QEMU_VM_EOF
and migration will be interrupted with half-loaded state.

There is no way out of this except defining an incompatible
migration protocol.  Not-so-long-term we should really try to define
one that is not a joke, but the bug is serious so we need a solution
for 0.15.  A sentinel at the end of embedded structs does remove the
ambiguity.

Of course, this can be restricted to new machine models, and this
is what the patch series does.  (And note that only patch 3 is specific
to the short-term solution, everything else is entirely generic).

Untested beyond compilation.

I have now tested this series (exactly as sent) both by examining
manually the differences between the two formats on the same guest
state, and by a mix of saves/restores (new on new, 0.14 on new
pc-0.14, new pc-0.14 on 0.14; also the same combinations on RHEL).  It
always does what is expected.

Michael Tsirkin objected that the format should be passed as a
parameter in the migrate command.  I kind of agree, however since this
is a real bug you would need to bump the default for new machine
types, and this default would still go in the QEMUMachine struct like
I am doing.  So I consider the two settings to be orthogonal.  Also,
the alternative requires changes to the whole management stack and if
the default is not changed it imposes a broken format unless you
update the management tools.  Clearly much less bang for the buck.

I think this is ready to go into 0.15.

I'll take a look for 0.15.

The bug happens when migrating
to 0.14 a pc-0.14 machine created with QEMU 0.15 and which has a
floppy.  The media changed subsection is almost always included, and
this causes problems when migrating to 0.14 which didn't have any
subsection for the floppy device.  While QEMU support for migration to
old version admittedly depends on luck, this isn't true of certain
downstreams :) which would like to have an unambiguous migration
format.

So this got me thinking about where we're at with migration and where we need to go.

I actually think there might be a reasonable path forward if we attack the problem differently than we have so far.

== Today ==

Today we only support generating the latest serialization of devices. To increase the probability of the latest version working on older versions of QEMU, we strategically omit fields that we know can safely be omitted with older versions (subsections). More than likely, migrating new to old won't work.

Migrating old to new is more likely to work. We version each section in order to be able to identify when we're dealing with old.

But all of this logic lives in one of two forms: either as a savevm/loadvm callback that takes a QEMUFile and writes a byte serialization to the stream in an open-coded way (usually big endian), or encoded declaratively in a VMState section.

== What we need ==

We need to decompose migration into three different problems: 1) serializing device state, 2) transforming the device model in order to satisfy forwards and backwards compatibility, and 3) encoding the serialized device model on the wire.

We also need a way to future proof ourselves.

== What we can do ==

1) Add migration capabilities to future proof ourselves. I think the simplest way this would work is to have a 'query-migration-capabilities' command that returns a bitmask of supported migration features. I think we would also introduce a 'set-migration-capabilities' command that can mask some of the supported features.

A management tool would run 'query-migration-capabilities' on the source and destination, take the intersection of the two masks, and set that mask on both the source and destination.

Lack of support for these commands indicates a mask of zero which is the protocol we offer today.
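
As a rough illustration of the negotiation described above (not code from any patch; the capability bits, the MigrationEndpoint struct and all function names are made up for this sketch), the management-tool side could look something like:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical capability bits; the names are illustrative only. */
enum {
    MIG_CAP_SECTION_SIZES       = 1u << 0,
    MIG_CAP_SUBSECTION_SENTINEL = 1u << 1,
};

/* Hypothetical view of one QEMU instance as seen by a management tool. */
typedef struct {
    int has_query_command;  /* 0 for a QEMU that predates the commands */
    uint32_t supported;     /* what query-migration-capabilities would report */
    uint32_t active;        /* what set-migration-capabilities last applied */
} MigrationEndpoint;

/* Lack of support for the query command is treated as a mask of zero. */
static uint32_t query_caps(const MigrationEndpoint *ep)
{
    return ep->has_query_command ? ep->supported : 0;
}

/* The mask the management tool would set on both sides. */
static uint32_t negotiate_caps(const MigrationEndpoint *src,
                               const MigrationEndpoint *dst)
{
    return query_caps(src) & query_caps(dst);
}

/* Setting may only mask supported features, never enable unknown ones. */
static void set_caps(MigrationEndpoint *ep, uint32_t mask)
{
    assert((mask & ~query_caps(ep)) == 0);
    ep->active = mask;
}
```

With a source that reports both bits and a destination that reports only MIG_CAP_SECTION_SIZES, negotiate_caps() yields just that one bit; against a destination without the commands it yields zero, i.e. today's protocol.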

2) Switch to a visitor model to serialize device state. This involves converting any occurrence of:

qemu_put_be32(f, port->guest_connected);

To:

visit_type_u32(v, "guest_connected", &port->guest_connected, &local_err);

It's 100% mechanical and makes absolutely no logic change. It works equally well with legacy and VMState migration handlers.
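
To make the idea concrete, here is a minimal self-contained sketch of what such a visitor could look like. It is only an assumption about the shape of the interface: the Visitor struct, the Port stand-in and the text backend are invented for illustration, and the error argument from the call above is left out to keep it short.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical minimal visitor: one callback per primitive type. */
typedef struct Visitor Visitor;
struct Visitor {
    void (*type_u32)(Visitor *v, const char *name, uint32_t *val);
};

static void visit_type_u32(Visitor *v, const char *name, uint32_t *val)
{
    v->type_u32(v, name, val);
}

/* Stand-in for a device structure; not the real virtio-serial port. */
typedef struct {
    uint32_t guest_connected;
    uint32_t host_connected;
} Port;

/* Device code names its fields and leaves the encoding to the visitor. */
static void port_visit(Visitor *v, Port *port)
{
    visit_type_u32(v, "guest_connected", &port->guest_connected);
    visit_type_u32(v, "host_connected", &port->host_connected);
}

/* One possible backend: render "name=value" lines into a text buffer. */
typedef struct {
    Visitor base;   /* first member, so a Visitor * can be cast back */
    char text[128];
    size_t len;
} TextVisitor;

static void text_type_u32(Visitor *v, const char *name, uint32_t *val)
{
    TextVisitor *tv = (TextVisitor *)v;
    size_t room = sizeof(tv->text) - tv->len;
    int n = snprintf(tv->text + tv->len, room, "%s=%u\n",
                     name, (unsigned)*val);

    assert(n > 0 && (size_t)n < room);   /* output must fit the buffer */
    tv->len += (size_t)n;
}

static void text_visitor_init(TextVisitor *tv)
{
    tv->base.type_u32 = text_type_u32;
    tv->text[0] = '\0';
    tv->len = 0;
}
```

The point of the sketch is that port_visit() no longer knows anything about the wire format; swapping the backend changes the output without touching device code.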

3) Add a Visitor class that operates on QEMUFile.

At this stage, we can migrate to data structures. That means we can migrate to QEMUFile, QObjects, or JSON. We could change the protocol at this stage to something that was still binary but had section sizes and things of that nature.
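
A sketch of the binary backend from step 3, under the assumption that a plain byte buffer (ByteStream, a made-up type) stands in for QEMUFile. The output callback emits the same four big-endian bytes that qemu_put_be32() writes today, and the input callback reads them back; none of these names exist in the tree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct Visitor Visitor;
struct Visitor {
    void (*type_u32)(Visitor *v, const char *name, uint32_t *val);
};

/* Stand-in for QEMUFile: a fixed byte buffer with a cursor. */
typedef struct {
    uint8_t data[64];
    size_t pos;
} ByteStream;

typedef struct {
    Visitor base;    /* first member, so a Visitor * can be cast back */
    ByteStream *s;
} StreamVisitor;

/* Output: the name is ignored; the value goes out as 4 big-endian bytes. */
static void out_type_u32(Visitor *v, const char *name, uint32_t *val)
{
    ByteStream *s = ((StreamVisitor *)v)->s;

    (void)name;
    assert(s->pos + 4 <= sizeof(s->data));
    s->data[s->pos++] = (uint8_t)(*val >> 24);
    s->data[s->pos++] = (uint8_t)(*val >> 16);
    s->data[s->pos++] = (uint8_t)(*val >> 8);
    s->data[s->pos++] = (uint8_t)(*val);
}

/* Input: read 4 big-endian bytes from the stream into the field. */
static void in_type_u32(Visitor *v, const char *name, uint32_t *val)
{
    ByteStream *s = ((StreamVisitor *)v)->s;

    (void)name;
    assert(s->pos + 4 <= sizeof(s->data));
    *val = ((uint32_t)s->data[s->pos] << 24) |
           ((uint32_t)s->data[s->pos + 1] << 16) |
           ((uint32_t)s->data[s->pos + 2] << 8) |
           (uint32_t)s->data[s->pos + 3];
    s->pos += 4;
}
```

Because the field name is simply dropped here, this backend stays byte-compatible with the existing stream, while other backends are free to use the name.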

But we shouldn't stop here.

4) Compatibility logic should be extracted from the savevm functions and VMState functions into separate functions that take a data structure. Basically, we want to have something roughly equivalent to:

QObject *e1000_migration_compatibility(QObject *src, int src_version, int dst_version);

We can have lots of helpers that reuse the VMState declarative stuff to do this, but this should be registered independently of the main serialization handler.

This moves us to a model where we always generate the latest serialization format, and then have specific ways to convert to older mechanisms. It allows us to do very big backwards compatibility steps like convert the state of one device into two separate devices (because we're just dealing with in-memory data structures).
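
A minimal sketch of such a compat hook, assuming a tiny name/value table (DeviceState, invented here) in place of a real QObject and a made-up device whose version 2 added a "media_changed" field. Whether dropping the field is actually safe is a per-device decision; the sketch only shows where that decision would live.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_FIELDS 8

typedef struct {
    const char *name;
    uint32_t value;
} Field;

/* Stand-in for a QObject dictionary holding one device's state. */
typedef struct {
    Field fields[MAX_FIELDS];
    int count;
} DeviceState;

static int state_find(const DeviceState *st, const char *name)
{
    for (int i = 0; i < st->count; i++) {
        if (strcmp(st->fields[i].name, name) == 0) {
            return i;
        }
    }
    return -1;
}

static void state_set(DeviceState *st, const char *name, uint32_t value)
{
    int i = state_find(st, name);

    if (i < 0) {
        assert(st->count < MAX_FIELDS);
        i = st->count++;
        st->fields[i].name = name;
    }
    st->fields[i].value = value;
}

static void state_del(DeviceState *st, const char *name)
{
    int i = state_find(st, name);

    if (i >= 0) {
        st->fields[i] = st->fields[--st->count];  /* order is not kept */
    }
}

/*
 * Hypothetical compat hook: rewrites the in-memory state from
 * src_version to dst_version. Returns 0 on success, -1 if this
 * pair of versions is not handled.
 */
static int exampledev_migration_compat(DeviceState *st,
                                       int src_version, int dst_version)
{
    if (src_version == dst_version) {
        return 0;
    }
    if (src_version == 2 && dst_version == 1) {
        state_del(st, "media_changed");     /* unknown to version 1 */
        return 0;
    }
    if (src_version == 1 && dst_version == 2) {
        state_set(st, "media_changed", 0);  /* assumed default value */
        return 0;
    }
    return -1;
}
```

The serialization handler never sees any of this; it always produces the latest layout, and the hook runs on the data structure before it is encoded for the wire.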

It's this step that lets us truly support compatibility with migration. The good news is, it doesn't have to be all or nothing. Since we always already generate the latest serialization format, the existing code only deals with migrating older versions to the latest which is something that isn't all that important.

So if we did this in 1.0, we could have a single function that converted the 1.0 device model to 1.1 and vice versa, and we'd be fine. We wouldn't have to touch 200 devices to do this.

5) Once we're here, we can implement the next 5-year format. That could be ASN.1 and be bidirectional or whatever makes the most sense. We could support 50 formats if we wanted to. As long as the transport is distinct from the serialization and compat routines, it really doesn't matter.

Regards,

Anthony Liguori

Paolo



