qemu-devel

From: Peter Xu
Subject: Re: [Qemu-devel] [PATCH v9 5/8] migration/ram.c: add a notifier chain for precopy
Date: Wed, 28 Nov 2018 17:32:20 +0800
User-agent: Mutt/1.10.1 (2018-07-13)

On Wed, Nov 28, 2018 at 05:01:31PM +0800, Wei Wang wrote:
> On 11/28/2018 01:26 PM, Peter Xu wrote:
> > 
> > Ok thanks.  Please just make sure you will capture all the error
> > cases, e.g., I also see path like this (a few lines below):
> > 
> >          if (pages < 0) {
> >              qemu_file_set_error(f, pages);
> >              break;
> >          }
> > 
> > It seems that you missed that one.
> 
> I think that one should be fine. This notification is actually put at the
> bottom of ram_save_iterate. All the above error will bail out to the "out:"
> path and then go to call precopy_notify(PRECOPY_NOTIFY_ERR).

Ok, maybe I was pointing to a wrong one. :)
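To make the flow described above concrete, here is a rough sketch of "break out of the loop, then notify once at the bottom".  It is not the real ram_save_iterate(): iterate_sketch(), its parameters and the counter are made up for illustration, and the counter only stands in for the precopy_notify(PRECOPY_NOTIFY_ERR) call.

```c
#include <assert.h>

/* Hypothetical stand-in for ram_save_iterate(); not the real function.
 * pages[i] plays the role of the per-round return value ("pages"). */
int iterate_sketch(const int *pages, int n, int *err_notified)
{
    int ret = 0;

    assert(pages && err_notified);
    for (int i = 0; i < n; i++) {
        if (pages[i] < 0) {
            ret = pages[i];   /* mirrors qemu_file_set_error(f, pages) */
            break;
        }
    }

    /* Single exit point: every error above ends up here. */
    if (ret < 0) {
        (*err_notified)++;    /* stands in for precopy_notify(PRECOPY_NOTIFY_ERR) */
    }
    return ret;
}
```

With this shape, any error raised inside the loop is reported exactly once, which is the property argued for in the quoted text.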

> 
> > 
> > I would even suggest that you capture the error with higher level.
> > E.g., in migration_iteration_run() after qemu_savevm_state_iterate().
> > Or we can just check the return value of qemu_savevm_state_iterate(),
> > which we have ignored so far.
> 
> I'm not very sure about the higher level, because other SaveStateEntry
> handlers may cause errors that this feature doesn't need to care about.
> I think we may only need it in ram_save.

What I am worried about here are corner cases where we might forget to
stop the hinting.  Here is a made-up example sequence of events:

  (start migration)
  START_MIGRATION
  BEFORE_SYNC
  AFTER_SYNC
  ...
  BEFORE_SYNC
  AFTER_SYNC
  (some SaveStateEntry other than RAM failed, and
   migration_detect_error returned MIG_THR_ERR_FATAL, so we need to
   fail the migration; however, the previous ram_save_iterate call for
   RAM's own SaveStateEntry saw no error, so no ERROR event was
   raised)

Then it seems the hinting will last forever.  Given that, I'm no longer
sure whether this can be done RAM-only: even if you hook
ram_save_complete() and also introduce PRECOPY_END, you may still miss
the PRECOPY_END event, since AFAIU ram_save_complete() won't be called
at all in this case.

Could this happen?
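
A toy model of that sequence, in case it helps.  None of this is QEMU code: the event names follow this thread, while hint_notify(), ram_iterate(), iterate_all() and hinting_leaks() are hypothetical stand-ins.  It only shows that an error raised by a non-RAM entry never reaches a notifier that is driven from the RAM path alone, whereas a check one level up would catch it.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; event names mirror the ones used in this thread. */
enum precopy_event { START_MIGRATION, BEFORE_SYNC, AFTER_SYNC, NOTIFY_ERR };

static bool hinting_active;

/* Stand-in for the balloon notifier callback. */
static void hint_notify(enum precopy_event ev)
{
    if (ev == START_MIGRATION) {
        hinting_active = true;
    } else if (ev == NOTIFY_ERR) {
        hinting_active = false;
    }
}

/* Stand-in for the RAM entry: it notifies only about its own errors. */
static int ram_iterate(int ram_ret)
{
    hint_notify(BEFORE_SYNC);
    hint_notify(AFTER_SYNC);
    if (ram_ret < 0) {
        hint_notify(NOTIFY_ERR);
    }
    return ram_ret;
}

/* One round over two entries: RAM first, then some other device. */
static int iterate_all(int ram_ret, int other_ret, bool check_at_top)
{
    int ret = ram_iterate(ram_ret);

    if (ret >= 0) {
        ret = other_ret;              /* the non-RAM entry's result */
    }
    if (ret < 0 && check_at_top) {
        hint_notify(NOTIFY_ERR);      /* higher-level error capture */
    }
    return ret;
}

/* Returns true if hinting is still running after a non-RAM failure. */
bool hinting_leaks(bool check_at_top)
{
    hint_notify(START_MIGRATION);
    int ret = iterate_all(0, -1, check_at_top);
    assert(ret < 0);                  /* the migration did fail */
    return hinting_active;
}
```

In this model hinting_leaks(false) reports that hinting is still active after the failure, and hinting_leaks(true) reports that it was stopped.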

> 
> 
> > [1]
> > 
> > > 
> > > > Another thing to mention about the "reasons" (though I see it more
> > > > like "events"): have you thought about adding a PRECOPY_NOTIFY_END?
> > > > It might help in some cases:
> > > > 
> > > >     - then you don't need to trickily export the migrate_postcopy()
> > > >       since you'll notify that before postcopy starts
> > > I'm thinking probably we don't need to export migrate_postcopy even now.
> > > It's more like a sanity check, and not needed because now we have the
> > > notifier registered to the precopy specific callchain, which has ensured
> > > that it is invoked via precopy.
> > But postcopy will always start with precopy, no?
> 
> Yes, but I think we could add the check in precopy_notify()

I'm not sure that's good.  If the notifier could potentially have
other users, they might still work with postcopy, and they might expect
e.g. BEFORE_SYNC to be called for every sync, even if it's at the
precopy stage of a postcopy.  In that sense I still feel
PRECOPY_END is better (so always call it at the end of precopy, no
matter whether there's a postcopy afterwards).  It sounds like a
cleaner interface.

Or you can check it in the balloon specific callback and ignore the
event if postcopy is on, but then we're back to needing to export the
API, so it seems pointless.
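
A minimal sketch of the PRECOPY_END idea, assuming an END event exists (it is only a proposal in this thread).  All names here (EV_*, struct listener_state, balloon_cb(), counter_cb(), run_precopy()) are hypothetical and nothing below is QEMU code.  The point is that the emitter never filters on postcopy, so a second listener still sees every sync, while the balloon-style callback stops on END or ERR without knowing anything about postcopy.

```c
#include <assert.h>
#include <stdbool.h>

enum precopy_event { EV_START, EV_BEFORE_SYNC, EV_AFTER_SYNC, EV_END, EV_ERR };

struct listener_state {
    bool hinting;     /* state kept by the balloon-style listener */
    int syncs_seen;   /* state kept by some other, unrelated listener */
};

/* Balloon-style callback: needs no knowledge of postcopy at all. */
static void balloon_cb(struct listener_state *s, enum precopy_event ev)
{
    switch (ev) {
    case EV_START:
        s->hinting = true;
        break;
    case EV_END:
    case EV_ERR:
        s->hinting = false;
        break;
    default:
        break;
    }
}

/* Another user of the chain that expects to see every sync. */
static void counter_cb(struct listener_state *s, enum precopy_event ev)
{
    if (ev == EV_BEFORE_SYNC) {
        s->syncs_seen++;
    }
}

static void notify(struct listener_state *s, enum precopy_event ev)
{
    balloon_cb(s, ev);
    counter_cb(s, ev);
}

/* Precopy phase; END is sent whether or not postcopy follows. */
struct listener_state run_precopy(int nr_syncs, bool postcopy_follows)
{
    struct listener_state s = { false, 0 };

    assert(nr_syncs >= 0);
    notify(&s, EV_START);
    for (int i = 0; i < nr_syncs; i++) {
        notify(&s, EV_BEFORE_SYNC);
        notify(&s, EV_AFTER_SYNC);
    }
    (void)postcopy_follows;   /* deliberately not consulted by the emitter */
    notify(&s, EV_END);
    return s;
}
```

Calling run_precopy(3, true) and run_precopy(3, false) leaves both listeners in the same state, which is what makes the interface feel cleaner to me.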

Regards,

-- 
Peter Xu


