qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v3 1/6] migration: Allow immutable device state to be migrate


From: Peter Xu
Subject: Re: [PATCH v3 1/6] migration: Allow immutable device state to be migrated early (i.e., before RAM)
Date: Wed, 11 Jan 2023 12:28:16 -0500

On Wed, Jan 11, 2023 at 05:58:36PM +0100, David Hildenbrand wrote:
> On 11.01.23 17:35, Peter Xu wrote:
> > On Wed, Jan 11, 2023 at 02:48:09PM +0100, David Hildenbrand wrote:
> > > On 10.01.23 21:03, Peter Xu wrote:
> > > > On Tue, Jan 10, 2023 at 12:52:32PM +0100, David Hildenbrand wrote:
> > > > > The following seems to work,
> > > > 
> > > > That looks much better at least from the diffstat pov (comparing to the
> > > > existing patch 1+5 and the framework changes), thanks.
> > > > 
> > > > > but makes analyze-migration.py angry:
> > > > > 
> > > > > $ ./scripts/analyze-migration.py -f STATEFILE
> > > > > Traceback (most recent call last):
> > > > >     File "/home/dhildenb/git/qemu/./scripts/analyze-migration.py", 
> > > > > line 605, in <module>
> > > > >       dump.read(dump_memory = args.memory)
> > > > >     File "/home/dhildenb/git/qemu/./scripts/analyze-migration.py", 
> > > > > line 539, in read
> > > > >       classdesc = self.section_classes[section_key]
> > > > >                   ~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^
> > > > > KeyError: ('0000:00:03.0/virtio-mem-early', 0)
> > > > > 
> > > > > 
> > > > > We need the vmdesc to create info for the device.
> > > > 
> > > > Migration may ignore the save entry if save_state() not provided in the
> > > > "devices" section:
> > > > 
> > > >           if ((!se->ops || !se->ops->save_state) && !se->vmsd) {
> > > >               continue;
> > > >           }
> > > > 
> > > > Could you try providing a shim save_state() for the new virtio-mem save
> > > > entry?
> > > > 
> > > > /*
> > > >    * Shim function to make sure the save entry will be dumped into 
> > > > "devices"
> > > >    * section, to make analyze-migration.py happy.
> > > >    */
> > > > static void virtio_mem_save_state_early(QEMUFile *file, void *opaque)
> > > > {
> > > > }
> > > > 
> > > > Then:
> > > > 
> > > > static const SaveVMHandlers vmstate_virtio_mem_device_early_ops = {
> > > >       .save_setup = virtio_mem_save_setup_early,
> > > >       .save_state = virtio_mem_save_state_early,
> > > >       .load_state = virtio_mem_load_state_early,
> > > > };
> > > > 
> > > > I'm not 100% sure it'll work yet, but maybe worth trying.
> > > 
> > > It doesn't. virtio_mem_load_state_early() will get called twice (once with
> > > state saved during save_setup() and once with effectively nothing during
> > > save_state(), which breaks the whole migration).
> > 
> > Oh hold on.. so load_state() to pair save_setup()? Maybe you should pair it
> > with load_setup(), which I just noticed..  Then the load_state() needs to
> > be another shim like save_state().
> 
> Let me see if that improves the situation. E.g., ram.c writes state in
> save_setup() but doesn't load any state in load_setup() -- it's all done in
> load_state().
> 
> ... no, still not working. It will read garbage during load_setup() now.
> 
> qemu-system-x86_64: Property 'memaddr' changed from 0x100000002037261 to
> 0x140000000
> qemu-system-x86_64: Failed to load virtio-mem-device-early:tmp
> qemu-system-x86_64: Load state of device 0000:00:03.0/virtio-mem-early
> failed
> qemu-system-x86_64: load of migration failed: Invalid argument
> 
> 
> I don't think that load_setup() is supposed to consume anything useful from
> the migration stream. I suspects it should actually not even consume a QEMU
> file right now, because they way it's used is just for initializing stuff.
> 
> qemu_loadvm_state_main() does the actual loading part, parsing sections etc.
> qemu_loadvm_state_setup() doesn't do any of that.
> 
> AFAIKS, at least qemu_loadvm_state_setup() would have to parse sections and
> the save_setup() users would have to be converted into using load_setup() as
> well. Not sure if more is missing.

Ouch, yeah, load_setup() is unfortunate. :(

I think load_setup() should be named load_init() without having the
qemufile at all.  But let's keep that aside for now, and see what other
option we have..

> 
> Even with that, I doubt that it would make analyze-migration.py work,
> because what we store is something different than what we record in the
> vmdesc.
> 
> > 
> > > 
> > > vmdesc handling is also wrong, because analyze-migration.py gets confused
> > > because it receives data stored during save_setup() but vmdesc created
> > > during save_state() was told that there would be "nothing" to interpret 
> > > ...
> > > 
> > > $ ./scripts/analyze-migration.py -f STATEFILE
> > > {
> > >      "ram (2)": {
> > >          "section sizes": {
> > >              "0000:00:03.0/mem0": "0x0000000f00000000",
> > >              "pc.ram": "0x0000000100000000",
> > >              "/rom@etc/acpi/tables": "0x0000000000020000",
> > >              "pc.bios": "0x0000000000040000",
> > >              "0000:00:02.0/e1000.rom": "0x0000000000040000",
> > >              "pc.rom": "0x0000000000020000",
> > >              "/rom@etc/table-loader": "0x0000000000001000",
> > >              "/rom@etc/acpi/rsdp": "0x0000000000001000"
> > >          }
> > >      },
> > >      "0000:00:03.0/virtio-mem-early (51)": {
> > >          "data": ""
> > >      }
> > > }
> > 
> > Yeah this is expected, because the whole data stream within the setup phase
> > is a black box and not described by anything.  "ram" can display correctly
> > only because it's hard coded in the python script, I think.  The trick
> > above can only make the script not break but not teaching the script to
> > also check for the early vmsd.
> 
> Note that the issue here is that the scripts stops the output after the
> virtio-mem device. So I'm not complaining about the "data": "", but about
> the vmstate according to the vmdesc not having any other devices :)
> 
> > 
> > But that's one step forward and IMHO it's fine for special devices. For
> > example, even with your initial patch, I think the static analyzer (aka,
> > qemu -dump-vmstate) will also ignore your new vmsd anyway because it's not
> > directly bound to the DeviceState* but registered separately.  So no ideal
> > solution so far, afaict, without more work done on this aspect.
> > 
> > > 
> > > 
> > > Not sure if the whole thing becomes nicer when manually looking up the
> > > vmdesc ... because filling it will also requires manually storing the
> > > se->idstr and the se->instance_id, whereby both are migration-internal and
> > > not available inside save_setup().
> > > 
> > > 
> > > Hm, I really prefer something like the simplified version that let's
> > > migration core deal with vmstate to be migrated during save_setup() phase.
> > > We could avoid the vmstate->immutable flag and simply use a separate
> > > function for registering the vmstate, like:
> > > 
> > > vmstate_register_immutable()
> > > vmstate_register_early()
> > > ...
> > 
> > I agree, this looks useful.  It's just that the devices that need this will
> > be rare, and unfortunately some of them already implemented the stream in
> > other ways so maybe non-trivial to convert them.
> 
> Right, I agree about the "rare" part and that conversion might be hard,
> because they are not using a vmstate descriptor.
> 
> The only way to avoid that is moving away from using a vmstate descriptor
> and storing everything manually in virtio-mem code. Not particularly happy
> about that, but it would be the only feasible option without touching
> migration code that I can see.
> 
> > 
> > Not against it if you want to keep exploring, but if so please avoid using
> > the priority field, also I'd hope the early vmsd will be saved within
> > qemu_savevm_state_setup() so maybe it can be another alternative to
> > save_setup().
> > 
> > Maybe it's still easier we keep going with the save_setup() and the shim
> > functions above, if it works for you.
> 
> I'll happy to go the path you suggested if we can make it work. I'd be happy
> to have something "reasonable" for the virtio-mem device in the
> analyze-migration.py output. But I could live with just nothing useful for
> the device itself -- as long as at least the other devices still show up.

So, let's see whether we can go back to the load_state() approach..

static const SaveVMHandlers vmstate_virtio_mem_device_early_ops = {
      .save_setup = virtio_mem_save_setup_early,
      .save_state = virtio_mem_save_state_early,
      .load_state = virtio_mem_load_state_early,
};

vfio handled it using a single header flag for either save_setup() or
save_state(), then load_state() identifies it:

    data = qemu_get_be64(f);
    ...
        switch (data) {
        case VFIO_MIG_FLAG_DEV_CONFIG_STATE:
        ...
        case VFIO_MIG_FLAG_DEV_SETUP_STATE:
        ...

Maybe play the same trick here?  The flag needs to be hard coded but at
least not the vmsd.  Based on previous experience, I could have missed
something else, though. :)

Or if you like go the other way I'm fine too.

Thanks,

-- 
Peter Xu




reply via email to

[Prev in Thread] Current Thread [Next in Thread]