qemu-devel
Re: [Qemu-devel] [RFC v2 10/33] migration: allow dst vm pause on postcopy


From: Dr. David Alan Gilbert
Subject: Re: [Qemu-devel] [RFC v2 10/33] migration: allow dst vm pause on postcopy
Date: Mon, 9 Oct 2017 19:58:13 +0100
User-agent: Mutt/1.9.1 (2017-09-22)

* Peter Xu (address@hidden) wrote:
> On Thu, Sep 21, 2017 at 08:29:03PM +0100, Dr. David Alan Gilbert wrote:
> > * Peter Xu (address@hidden) wrote:
> > > When there is an IO error on the incoming channel (e.g., network down),
> > > instead of bailing out immediately, we allow the dst vm to switch to the
> > > new POSTCOPY_PAUSE state. Currently it is still simple - it waits on the
> > > new semaphore until someone pokes it for another attempt.
> > > 
> > > Signed-off-by: Peter Xu <address@hidden>
> > > ---
> > >  migration/migration.c  |  1 +
> > >  migration/migration.h  |  3 +++
> > >  migration/savevm.c     | 60 ++++++++++++++++++++++++++++++++++++++++++++++++--
> > >  migration/trace-events |  2 ++
> > >  4 files changed, 64 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/migration/migration.c b/migration/migration.c
> > > index 8d26ea8..80de212 100644
> > > --- a/migration/migration.c
> > > +++ b/migration/migration.c
> > > @@ -146,6 +146,7 @@ MigrationIncomingState *migration_incoming_get_current(void)
> > >          memset(&mis_current, 0, sizeof(MigrationIncomingState));
> > >          qemu_mutex_init(&mis_current.rp_mutex);
> > >          qemu_event_init(&mis_current.main_thread_load_event, false);
> > > +        qemu_sem_init(&mis_current.postcopy_pause_sem_dst, 0);
> > >          once = true;
> > >      }
> > >      return &mis_current;
> > > diff --git a/migration/migration.h b/migration/migration.h
> > > index 0c957c9..c423682 100644
> > > --- a/migration/migration.h
> > > +++ b/migration/migration.h
> > > @@ -60,6 +60,9 @@ struct MigrationIncomingState {
> > >      /* The coroutine we should enter (back) after failover */
> > >      Coroutine *migration_incoming_co;
> > >      QemuSemaphore colo_incoming_sem;
> > > +
> > > +    /* notify PAUSED postcopy incoming migrations to try to continue */
> > > +    QemuSemaphore postcopy_pause_sem_dst;
> > >  };
> > >  
> > >  MigrationIncomingState *migration_incoming_get_current(void);
> > > diff --git a/migration/savevm.c b/migration/savevm.c
> > > index 7172f14..3777124 100644
> > > --- a/migration/savevm.c
> > > +++ b/migration/savevm.c
> > > @@ -1488,8 +1488,8 @@ static int loadvm_postcopy_ram_handle_discard(MigrationIncomingState *mis,
> > >   */
> > >  static void *postcopy_ram_listen_thread(void *opaque)
> > >  {
> > > -    QEMUFile *f = opaque;
> > >      MigrationIncomingState *mis = migration_incoming_get_current();
> > > +    QEMUFile *f = mis->from_src_file;
> > >      int load_res;
> > >  
> > >      migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
> > > @@ -1503,6 +1503,14 @@ static void *postcopy_ram_listen_thread(void *opaque)
> > >       */
> > >      qemu_file_set_blocking(f, true);
> > >      load_res = qemu_loadvm_state_main(f, mis);
> > > +
> > > +    /*
> > > +     * This is tricky, but, mis->from_src_file can change after it
> > > +     * returns, when postcopy recovery happened. In the future, we may
> > > +     * want a wrapper for the QEMUFile handle.
> > > +     */
> > > +    f = mis->from_src_file;
> > > +
> > >      /* And non-blocking again so we don't block in any cleanup */
> > >      qemu_file_set_blocking(f, false);
> > >  
> > > @@ -1581,7 +1589,7 @@ static int loadvm_postcopy_handle_listen(MigrationIncomingState *mis)
> > >      /* Start up the listening thread and wait for it to signal ready */
> > >      qemu_sem_init(&mis->listen_thread_sem, 0);
> > >      qemu_thread_create(&mis->listen_thread, "postcopy/listen",
> > > -                       postcopy_ram_listen_thread, mis->from_src_file,
> > > +                       postcopy_ram_listen_thread, NULL,
> > >                         QEMU_THREAD_DETACHED);
> > >      qemu_sem_wait(&mis->listen_thread_sem);
> > >      qemu_sem_destroy(&mis->listen_thread_sem);
> > > @@ -1966,11 +1974,44 @@ void qemu_loadvm_state_cleanup(void)
> > >      }
> > >  }
> > >  
> > > +/* Return true if we should continue the migration, or false. */
> > > +static bool postcopy_pause_incoming(MigrationIncomingState *mis)
> > > +{
> > > +    trace_postcopy_pause_incoming();
> > > +
> > > +    migrate_set_state(&mis->state, MIGRATION_STATUS_POSTCOPY_ACTIVE,
> > > +                      MIGRATION_STATUS_POSTCOPY_PAUSED);
> > > +
> > > +    assert(mis->from_src_file);
> > > +    qemu_file_shutdown(mis->from_src_file);
> > > +    qemu_fclose(mis->from_src_file);
> > > +    mis->from_src_file = NULL;
> > > +
> > > +    assert(mis->to_src_file);
> > > +    qemu_mutex_lock(&mis->rp_mutex);
> > > +    qemu_file_shutdown(mis->to_src_file);
> > 
> > Should you not do the shutdown() before the lock?
> > For example if the other thread is stuck, with rp_mutex
> > held, trying to write to to_src_file, then you'll block
> > waiting for the mutex.  If you call shutdown and then take
> > the lock, the other thread will error and release the lock.
> 
> The problem is that IMHO QEMUFile is not yet thread-safe itself.  So
> if we operate on it (even to shut it down) logically we need to have
> the lock, right?

That probably needs fixing for 'shutdown' under the assumption that
   a) No one has or is deleting/freeing the QEMUFile
   b) No one is closing the QEMUFile

The whole point of using shutdown() is that it forces any stuck send()s or
read()s to fail rather than staying stuck.

> Then, IMHO the question would be: when will the send() be stuck in the
> other thread?
> 
> Normally the only case I can think of is that the source didn't recv()
> fast enough, and we even consumed all the write buffer on the dst side (I
> don't really know how the kernel manages the buffers though, e.g. how
> the size of the buffer is defined...).
> 
> But when we reach here, the channel (say, from_src_file and to_src_file,
> since both of them are using the same channel behind the QEMUFile
> interface) should already be broken in some way, so IIUC even if there
> is a send() in the other thread, it should return at some point with a
> failure as well, just like how we reached here (possibly due to a
> read() failure).

We have to be careful about this; a network can fail in a way that it
gets stuck rather than failing - it can stay stuck until a full TCP
disconnection, and that takes about 30 mins (from memory).
The nice thing about using 'shutdown' is that you can kill the existing
connection if it's hung. (Which then raises an interesting question:
the rules in your migrate-incoming command become different if you
want to declare it failed!)  Having said that, you're right that at
this point stuff has already failed - so do we need the shutdown?
(You might want to do the shutdown as part of the recovery earlier,
or as a separate command to force the failure.)

> > I'm not quite sure what will happen if we end up calling this
> > before the main thread has returned from postcopy and the
> > device loading is complete.
> 
> IIUC you mean the time starts from when we got MIG_CMD_PACKAGED until
> main thread finishes handling that package?

Yes.

> Normally I think that should not matter much, since while handling the
> package it should hardly fail (we were reading from a buffer QIO
> channel, no real I/O there)... 

Note that while the main thread is reading the package, the listener
thread is receiving pages, so you can legally get a failure at that
point when the fd fails as it's receiving pages at the same time
as reading the devices.
(There's an argument that if it fails before you've received all
your devices then perhaps you can just restart the source)

> But I agree with the reasoning.  How
> about one more patch to postpone the "active" to "postcopy-active"
> state change until after the package is handled correctly?  Like:
> 
> --------------
> diff --git a/migration/savevm.c b/migration/savevm.c                     
> index b5c3214034..8317b2a7e2 100644 
> --- a/migration/savevm.c            
> +++ b/migration/savevm.c            
> @@ -1573,8 +1573,6 @@ static void *postcopy_ram_listen_thread(void *opaque)   
>                                                                     
>      QEMUFile *f = mis->from_src_file;                                   
>      int load_res;                  
>                                     
> -    migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,             
> -                                   MIGRATION_STATUS_POSTCOPY_ACTIVE);   
>      qemu_sem_post(&mis->listen_thread_sem);                             
>      trace_postcopy_ram_listen_thread_start();                           
>                                     
> @@ -1817,6 +1815,9 @@ static int loadvm_handle_cmd_packaged(MigrationIncomingState *mis)
>                                    
>      qemu_fclose(packf);            
>      object_unref(OBJECT(bioc));    
>                                     
> +    migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,             
> +                                   MIGRATION_STATUS_POSTCOPY_ACTIVE);   
> +                                   
>      return ret;                    
>  }                                  
> --------------
> 
> This function will only be called with "postcopy-active" state.

I *think* that's safe; you've got to be careful, but I can't see
anyone on the destination that cares about the distinction.

> > Also, at this point have we guaranteed no one else is about
> > to do an op on mis->to_src_file and will seg?
> 
> I think no?  Since IMHO the main thread is playing with the buffer QIO
> channel, rather than the real one?

OK.

> (btw, could I ask what's "seg"? :)

just short for segmentation fault; sig 11.

Dave

> -- 
> Peter Xu
--
Dr. David Alan Gilbert / address@hidden / Manchester, UK


