qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [RFC] New Migration Protocol using Visitor Interface


From: Anthony Liguori
Subject: Re: [Qemu-devel] [RFC] New Migration Protocol using Visitor Interface
Date: Mon, 03 Oct 2011 07:51:00 -0500
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.21) Gecko/20110831 Lightning/1.0b2 Thunderbird/3.1.13

On 10/03/2011 01:46 AM, Michael S. Tsirkin wrote:
On Mon, Sep 19, 2011 at 09:41:41AM -0500, Michael Roth wrote:
OVERVIEW

This patch series implements a QEMUFile Visitor class that's intended to abstract 
away direct calls to qemu_put_*/qemu_get_* for save/load functions. Currently this 
is done by always creating a QEMUFileInputVisitor/QEMUFileOutputVisitor pair with 
each call to qemu_fopen_ops() and maintaining a QEMUFile->I/O Visitor mapping. 
save/load functions that are to be converted would them simply use lookup routines 
to get a Visitor based on their QEMUFile arguments. Once these save/load functions 
are all converted, we can then change the interfaces in bulk and switch over to 
passing in the Visitor directly.

An example conversion of Slirp, which still uses old-style save/load routines, 
is included as part of this series. Anthony I believe has VMState converted 
over to using Visitors, so theoretically all VMStatified devices are good to go 
(Anthony: if that's the case feel free to rebase on this or point me to the 
repo to pull in, and I'll work off the resulting branch).

PLANS

Ultimately, the goal is to implement a new migration protocol that is 
self-describing, such that incompatibilities or migration errors can be more 
easilly detected. Currently, a simple change in data types for a particular 
device can introduce subtle bugs that won't be detected by the target, since 
the target interprets the data according to it's own expectation of what those 
data types are. Currently the plan is to use ASN.1 BER in place of QEMUFile.

By using a Visitor interface for migration, we also gain the ability to generate a 
migration schema (via, say, a JSONSchemaVisitor). This is similar to the VMState schema 
generator in Anthony's "Add live migration unit tests" series 
(http://thread.gmane.org/gmane.comp.emulators.qemu/97754/focus=97754), except that it is 
applicable to both VMState and old-style save/load functions.

There is still quite a bit a work to get to that point. Anthony addressed some 
of the issues on the VMState side of things in the aforementioned series, and I 
believe he already has some patches that convert VMState over to using visitors 
insteads of qemu_put_*/qemu_get_* directly. That being the case, what's left is:

1) Convert old-style save/load functions over to using visitors. I'm not sure 
what the best approach is for this. We basically have 2 options: either by 
converting them to VMState, or converting the functions over to using visitors, 
such as with the example slirp conversion. Even if we do option 2 it would 
likely need to be done in 2 phases:

phase 1: a mechanical conversion where we basically replace each 
qemu_put*/qemu_get* with a visit_*. This allows use to plop in a new, say 
BERVisitor to switch to using the new migration protocol. It should also be 
possible to do this without breaking migration compatibility with older 
versions. It does not however result in a well-specified schema if we were to 
plop in, say, a JSONSchemaVisitor. This is where we need phase 2.

phase 2: Currently, runtime state modifies our "schema" the way things are 
currently done. Slirp instance and virtqueue enumeration, for instance:

slirp/slirp.c:
     for (ex_ptr = slirp->exec_list; ex_ptr; ex_ptr = ex_ptr->ex_next)
         if (ex_ptr->ex_pty == 3) {
             struct socket *so;
             so = slirp_find_ctl_socket(slirp, ex_ptr->ex_addr,
                                        ntohs(ex_ptr->ex_fport));
             if (!so)
                 continue;

             qemu_put_byte(f, 42);
             slirp_socket_save(f, so);
         }

hw/virtio.c:
     for (i = 0; i<  VIRTIO_PCI_QUEUE_MAX; i++) {
         if (vdev->vq[i].vring.num == 0)
             break;

         qemu_put_be32(f, vdev->vq[i].vring.num);
         qemu_put_be64(f, vdev->vq[i].pa);
         qemu_put_be16s(f,&vdev->vq[i].last_avail_idx);
         if (vdev->binding->save_queue)
             vdev->binding->save_queue(vdev->binding_opaque, i, f);
     }


To be able to build a schema around this we'd need to do a visit_start_array() or 
visit_start_list(), to denote a list of entries, and ideally we'd also describe structure 
of these entries. But often there is no structure of these entries...they may just be 
values taken from various data structures. So, either we do hacky things like using 
visitor intefaces to create "fake" structs (we say we're dealing with a struct 
when we're really not), or we actually put these into an intermediate struct which we 
then use to assign to store/load migration fields from.

phase 2 sounds a whole lot like converting over to VMState. So I think we'll 
only want to go with option 2 in places where converting to VMState is not 
practical/planned. Or we can hold off on the phase 2 stuff and just focus on 
getting the new protocol in place ASAP. After which, we can rework things to 
support generating a well-specified schema.

2) Once the conversion stuff is done, we can modify the save/load interfaces in 
bulk to accept a Visitor in place of a QEMUFile (but continue using the old 
protocol/QEMUFile*Visitor for the time being)

3) Implement migration capabilities negotiation. This was discussed to some 
extent in the XBZRLE v4 thread 
(http://lists.gnu.org/archive/html/qemu-devel/2011-08/msg01155.html), basically 
having a set of migration capabilities that can be queried via QMP. Management 
tools would then choose the optimal migration settings from the intersection of 
these.

4) Implement the BERVisitor and make this the default migration protocol.

Most of the work will be in 1), though with the implementation in this series 
we should be able to do it incrementally. I'm not sure if the best approach is 
doing the mechanical phase 1 conversion, then doing phase 2 sometime after 4), 
doing phase 1 + 2 as part of 1), or just doing VMState conversions which gives 
basically the same capabilities as phase 1 + 2.

Thoughts?

Yes, I think this creates much more work than an upfront conversion
of the protocol to BER, then cleaning up interfaces, would be.

Here are some suggestions:

- Let's make the protocol be BER directly.
   As a first step, use a single octet string for
   the whole of data. Next, start splitting this up.

This can't be done without breaking the old style migration protocol. I don't have a problem with that but I do have a problem with repeatedly breaking migration protocol.

The Visitor interface is very little abstraction over a native BER interface. It lets us introduce BER incrementally while still supporting the old protocol.


- Don't try to use arrays at all. They don't make sense
   in either jason or asn. Instead, use sets for unordered
   data and sequences for ordered data.

sequences map to visit_start_array() pretty well. You would do something like:

visit_start_array(v, "entries", errp);
for (int i = 0; i < s->size; i++) {
    visit_type_int(v, NULL, &s->entry[i], errp);
}
visit_end_array(v, errp);

- Most data should be unordered: we don't normally care
   in which order we get device attributes.
   So we would have set of sequences, and
   each sequence is attribute name (ascii string) + attribute data

I'd actually be inclined to make everything ordered by default and not encode explicit field names. I'm afraid that encoding field names on the wire is going to easily increase the amount of device state by an order of magnitude.

- This does not discuss how we will handle cross version
   migration in a lot of detail. Presumably you expect capability magic to
   work for this. However this does not work for migration to file.
   A self describing format makes a much simpler solution possible:
   separating attributes which destination must parse from those
   it can skip safely. If destination sees an attribute
   that was marked as must parse and does not recognize it,
   migration fails. If not, just skip the attribute.

Compatibility and the wire format are more or less orthogonal. A self describing format helps eliminate a class of bugs but getting compatibility right requires quite a bit more.

Regards,

Anthony Liguori




reply via email to

[Prev in Thread] Current Thread [Next in Thread]