qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v9 5/8] migration/ram.c: add a notifier chain fo


From: Peter Xu
Subject: Re: [Qemu-devel] [PATCH v9 5/8] migration/ram.c: add a notifier chain for precopy
Date: Wed, 28 Nov 2018 13:26:55 +0800
User-agent: Mutt/1.10.1 (2018-07-13)

On Tue, Nov 27, 2018 at 06:25:40PM +0800, Wei Wang wrote:
> On 11/27/2018 03:38 PM, Peter Xu wrote:
> > On Thu, Nov 15, 2018 at 06:08:01PM +0800, Wei Wang wrote:
> > > +typedef enum PrecopyNotifyReason {
> > > +    PRECOPY_NOTIFY_ERR = 0,
> > > +    PRECOPY_NOTIFY_START_ITERATION = 1,
> > > +    PRECOPY_NOTIFY_BEFORE_SYNC_BITMAP = 2,
> > > +    PRECOPY_NOTIFY_AFTER_SYNC_BITMAP = 3,
> > > +    PRECOPY_NOTIFY_MAX = 4,
> > It would be nice to add some comments for each of the notify reason.
> > E.g., from the name PRECOPY_NOTIFY_START_ITERATION seems more like a
> > hook at the start of each iteration but according to [1] it should be
> > at the start of migration rather than each iteration (or when
> > migration restarts, though I'm not sure whether we really have this
> > yet).
> 
> OK. I think It would be better if the name itself could be straightforward.
> Probably we could change PRECOPY_NOTIFY_START_ITERATION to
> PRECOPY_NOTIFY_START_MIGRATION.

Sounds good.

> 
> 
> > > +} PrecopyNotifyReason;
> > > +
> > > +void precopy_infrastructure_init(void);
> > > +void precopy_add_notifier(Notifier *n);
> > > +void precopy_remove_notifier(Notifier *n);
> > > +
> > >   void ram_mig_init(void);
> > >   void qemu_guest_free_page_hint(void *addr, size_t len);
> > > diff --git a/migration/ram.c b/migration/ram.c
> > > index 229b791..65b1223 100644
> > > --- a/migration/ram.c
> > > +++ b/migration/ram.c
> > > @@ -292,6 +292,8 @@ struct RAMState {
> > >       bool ram_bulk_stage;
> > >       /* How many times we have dirty too many pages */
> > >       int dirty_rate_high_cnt;
> > > +    /* ram save states used for notifiers */
> > > +    int ram_save_state;
> > This can be removed?
> 
> Yes, thanks.
> 
> > 
> > >       /* these variables are used for bitmap sync */
> > >       /* last time we did a full bitmap_sync */
> > >       int64_t time_last_bitmap_sync;
> > > @@ -328,6 +330,28 @@ typedef struct RAMState RAMState;
> > >   static RAMState *ram_state;
> > > +static NotifierList precopy_notifier_list;
> > > +
> > > +void precopy_infrastructure_init(void)
> > > +{
> > > +    notifier_list_init(&precopy_notifier_list);
> > > +}
> > > +
> > > +void precopy_add_notifier(Notifier *n)
> > > +{
> > > +    notifier_list_add(&precopy_notifier_list, n);
> > > +}
> > > +
> > > +void precopy_remove_notifier(Notifier *n)
> > > +{
> > > +    notifier_remove(n);
> > > +}
> > > +
> > > +static void precopy_notify(PrecopyNotifyReason reason)
> > > +{
> > > +    notifier_list_notify(&precopy_notifier_list, &reason);
> > > +}
> > > +
> > >   uint64_t ram_bytes_remaining(void)
> > >   {
> > >       return ram_state ? (ram_state->migration_dirty_pages * 
> > > TARGET_PAGE_SIZE) :
> > > @@ -1642,6 +1666,8 @@ static void migration_bitmap_sync(RAMState *rs)
> > >       int64_t end_time;
> > >       uint64_t bytes_xfer_now;
> > > +    precopy_notify(PRECOPY_NOTIFY_BEFORE_SYNC_BITMAP);
> > > +
> > >       ram_counters.dirty_sync_count++;
> > >       if (!rs->time_last_bitmap_sync) {
> > > @@ -1699,6 +1725,8 @@ static void migration_bitmap_sync(RAMState *rs)
> > >       if (migrate_use_events()) {
> > >           qapi_event_send_migration_pass(ram_counters.dirty_sync_count);
> > >       }
> > > +
> > > +    precopy_notify(PRECOPY_NOTIFY_AFTER_SYNC_BITMAP);
> > >   }
> > >   /**
> > > @@ -2555,6 +2583,8 @@ static void ram_state_reset(RAMState *rs)
> > >       rs->last_page = 0;
> > >       rs->last_version = ram_list.version;
> > >       rs->ram_bulk_stage = true;
> > > +
> > > +    precopy_notify(PRECOPY_NOTIFY_START_ITERATION);
> > [1]
> > 
> > >   }
> > >   #define MAX_WAIT 50 /* ms, half buffered_file limit */
> > > @@ -3324,6 +3354,7 @@ out:
> > >       ret = qemu_file_get_error(f);
> > >       if (ret < 0) {
> > > +        precopy_notify(PRECOPY_NOTIFY_ERR);
> > Could you show me which function is this line in?
> > 
> > Line 3324 here is ram_save_complete(), but I cannot find this exact
> > place.
> 
> Sure, it's in ram_save_iterate():
> ...
> out:
>     qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
>     qemu_fflush(f);
>     ram_counters.transferred += 8;
> 
>     ret = qemu_file_get_error(f);
>     if (ret < 0) {
> +        precopy_notify(PRECOPY_NOTIFY_ERR);
>         return ret;
>     }
> 
>     return done;
> }

Ok thanks.  Please just make sure you will capture all the error
cases, e.g., I also see path like this (a few lines below):

        if (pages < 0) {
            qemu_file_set_error(f, pages);
            break;
        }

It seems that you missed that one.

I would even suggest that you capture the error with higher level.
E.g., in migration_iteration_run() after qemu_savevm_state_iterate().
Or we can just check the return value of qemu_savevm_state_iterate(),
which we have had ignored so far.

[1]

> 
> 
> > 
> > Another thing to mention about the "reasons" (though I see it more
> > like "events"): have you thought about adding a PRECOPY_NOTIFY_END?
> > It might help in some cases:
> > 
> >    - then you don't need to trickily export the migrate_postcopy()
> >      since you'll notify that before postcopy starts
> 
> I'm thinking probably we don't need to export migrate_postcopy even now.
> It's more like a sanity check, and not needed because now we have the
> notifier registered to the precopy specific callchain, which has ensured
> that
> it is invoked via precopy.

But postcopy will always start with precopy, no?

> 
> >    - you'll have a solid point that you'll 100% guarantee that we'll
> >      stop the free page hinting and don't need to worry about whether
> >      there is chance the hinting will be running without an end [2].
> 
> Thanks, I think it makes sense. Plan to add PRECOPY_NOTIFY_END in
> ram_save_complete.

Yeah you can.

Btw, if you're mostly adding the notifies only in RAM-only codes, then
you can consider add the "RAM" into the names of events too to be
clear.

My suggestion at [1] is precopy general, but you can still capture it
at the end of ram_save_iterate, then they are RAM-only again.  Please
feel free to choose what fits more...

> 
> 
> > 
> > Regarding [2] above: now the series only stops the hinting when
> > PRECOPY_NOTIFY_BEFORE_SYNC_BITMAP is notified.  Could there be a case
> > that it's missing?  E.g., what if we cancel/fail a migration during
> > precopy?  Have you tried it?
> > 
> 
> I think it has been handled by the above PRECOPY_NOTIFY_ERR

Indeed it should, as long as you're covering all the error cases.

Thanks,

-- 
Peter Xu



reply via email to

[Prev in Thread] Current Thread [Next in Thread]