From: Zhang Chen
Subject: Re: [Qemu-devel] [PATCH V7 RESEND 12/17] savevm: split the process of different stages for loadvm/savevm
Date: Sun, 3 Jun 2018 13:10:04 +0800

On Wed, May 16, 2018 at 2:56 AM, Dr. David Alan Gilbert <address@hidden> wrote:

> * Zhang Chen (address@hidden) wrote:
> > From: zhanghailiang <address@hidden>
> >
> > There are several stages during the loadvm/savevm process. In
> > different stages, migration incoming processes different types of
> > sections. We want to control these stages more accurately; it will
> > benefit COLO performance: we don't have to save sections of type
> > QEMU_VM_SECTION_START every time we do a checkpoint. Besides, we
> > want to separate the process of saving/loading memory and device
> > state.
> >
> > So we add two new helper functions, qemu_savevm_live_state() and
> > qemu_load_device_state(), to handle the different stages during
> > migration.
> >
> > Besides, we make qemu_loadvm_state_main() and qemu_save_device_state()
> > public, and simplify the code of qemu_save_device_state() by calling
> > the wrapper qemu_savevm_state_header().
> >
> > Signed-off-by: zhanghailiang <address@hidden>
> > Signed-off-by: Li Zhijian <address@hidden>
> > Signed-off-by: Zhang Chen <address@hidden>
> > Reviewed-by: Dr. David Alan Gilbert <address@hidden>
> > ---
> >  migration/colo.c   | 36 ++++++++++++++++++++++++++++--------
> >  migration/savevm.c | 35 ++++++++++++++++++++++++++++-------
> >  migration/savevm.h |  4 ++++
> >  3 files changed, 60 insertions(+), 15 deletions(-)
> >
> > diff --git a/migration/colo.c b/migration/colo.c
> > index cdff0a2490..5b055f79f1 100644
> > --- a/migration/colo.c
> > +++ b/migration/colo.c
> > @@ -30,6 +30,7 @@
> >  #include "block/block.h"
> >  #include "qapi/qapi-events-migration.h"
> >  #include "qapi/qmp/qerror.h"
> > +#include "sysemu/cpus.h"
> >
> >  static bool vmstate_loading;
> >  static Notifier packets_compare_notifier;
> > @@ -414,23 +415,30 @@ static int colo_do_checkpoint_transaction(MigrationState *s,
> >
> >      /* Disable block migration */
> >      migrate_set_block_enabled(false, &local_err);
> > -    qemu_savevm_state_header(fb);
> > -    qemu_savevm_state_setup(fb);
> >      qemu_mutex_lock_iothread();
> >      replication_do_checkpoint_all(&local_err);
> >      if (local_err) {
> >          qemu_mutex_unlock_iothread();
> >          goto out;
> >      }
> > -    qemu_savevm_state_complete_precopy(fb, false, false);
> > -    qemu_mutex_unlock_iothread();
> > -
> > -    qemu_fflush(fb);
> >
> >      colo_send_message(s->to_dst_file, COLO_MESSAGE_VMSTATE_SEND, &local_err);
> >      if (local_err) {
> >          goto out;
> >      }
> > +    /*
> > +     * Only save the VM's live state, which does not include the
> > +     * device state.
> > +     * TODO: We may need a timeout mechanism to prevent the COLO
> > +     * process from being blocked here.
> > +     */
>
> I guess that's the downside to transmitting it directly rather than
> into the buffer;
> Peter Xu's OOB command system would let you kill the connection - and
> that's something I think COLO should use.
> Still, the change saves you having that huge outgoing buffer on the
> source side and lets you start sending the checkpoint sooner, which
> means the pause time should be smaller.
>

Yes, you are right.
But I think this is a performance optimization; this series focuses on
enabling the feature.
I will do this job in the future.


>
> > +    qemu_savevm_live_state(s->to_dst_file);
>
> Does this actually need to be inside the qemu_mutex_lock_iothread?
> I'm pretty sure the device_state needs to be, but I'm not sure the
> live_state does.
>

I have checked the code: qemu_savevm_live_state() doesn't need to be
inside qemu_mutex_lock_iothread().
I will move it out of the lock area in the next version.
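
One possible shape for that change (a hypothetical sketch; the actual
ordering in the next version may differ):

    /* Live state (RAM) can be streamed without holding the iothread
     * lock, so drop the lock first and retake it only for the device
     * state.
     */
    qemu_mutex_unlock_iothread();
    qemu_savevm_live_state(s->to_dst_file);   /* no lock needed */

    qemu_mutex_lock_iothread();
    ret = qemu_save_device_state(fb);         /* lock still required */
    qemu_mutex_unlock_iothread();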



>
> > +    /* Note: device state is saved into buffer */
> > +    ret = qemu_save_device_state(fb);
> > +
> > +    qemu_mutex_unlock_iothread();
> > +
> > +    qemu_fflush(fb);
> > +
> >      /*
> >       * We need the size of the VMstate data in Secondary side,
> >       * With which we can decide how much data should be read.
> > @@ -643,6 +651,7 @@ void *colo_process_incoming_thread(void *opaque)
> >      uint64_t total_size;
> >      uint64_t value;
> >      Error *local_err = NULL;
> > +    int ret;
> >
> >      qemu_sem_init(&mis->colo_incoming_sem, 0);
> >
> > @@ -715,6 +724,16 @@ void *colo_process_incoming_thread(void *opaque)
> >              goto out;
> >          }
> >
> > +        qemu_mutex_lock_iothread();
> > +        cpu_synchronize_all_pre_loadvm();
> > +        ret = qemu_loadvm_state_main(mis->from_src_file, mis);
> > +        qemu_mutex_unlock_iothread();
> > +
> > +        if (ret < 0) {
> > +            error_report("Load VM's live state (ram) error");
> > +            goto out;
> > +        }
> > +
> >          value = colo_receive_message_value(mis->from_src_file,
> >                                   COLO_MESSAGE_VMSTATE_SIZE, &local_err);
> >          if (local_err) {
> > @@ -748,8 +767,9 @@ void *colo_process_incoming_thread(void *opaque)
> >          qemu_mutex_lock_iothread();
> >          qemu_system_reset(SHUTDOWN_CAUSE_NONE);
>
> Is the reset safe? Are you sure it doesn't change the ram you've just
> loaded?
>

Yes, it is safe. In my tests the secondary node runs well.
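
For context, the secondary-side sequence in this patch is roughly
(a simplified sketch, error handling omitted):

    /* 1. Load the live state (RAM) streamed directly from the primary. */
    cpu_synchronize_all_pre_loadvm();
    ret = qemu_loadvm_state_main(mis->from_src_file, mis);

    /* 2. Reset the machine; the question above is whether this step
     *    can touch the RAM loaded in step 1.
     */
    qemu_system_reset(SHUTDOWN_CAUSE_NONE);

    /* 3. Load the buffered device state on top of the reset machine
     *    (qemu_load_device_state() also runs
     *    cpu_synchronize_all_post_init() internally).
     */
    ret = qemu_load_device_state(fb);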


>
> >          vmstate_loading = true;
> > -        if (qemu_loadvm_state(fb) < 0) {
> > -            error_report("COLO: loadvm failed");
> > +        ret = qemu_load_device_state(fb);
> > +        if (ret < 0) {
> > +            error_report("COLO: load device state failed");
> >              qemu_mutex_unlock_iothread();
> >              goto out;
> >          }
> > diff --git a/migration/savevm.c b/migration/savevm.c
> > index ec0bff09ce..0f61239429 100644
> > --- a/migration/savevm.c
> > +++ b/migration/savevm.c
> > @@ -1332,13 +1332,20 @@ done:
> >      return ret;
> >  }
> >
> > -static int qemu_save_device_state(QEMUFile *f)
> > +void qemu_savevm_live_state(QEMUFile *f)
> >  {
> > -    SaveStateEntry *se;
> > +    /* save QEMU_VM_SECTION_END section */
> > +    qemu_savevm_state_complete_precopy(f, true, false);
> > +    qemu_put_byte(f, QEMU_VM_EOF);
> > +}
> >
> > -    qemu_put_be32(f, QEMU_VM_FILE_MAGIC);
> > -    qemu_put_be32(f, QEMU_VM_FILE_VERSION);
> > +int qemu_save_device_state(QEMUFile *f)
> > +{
> > +    SaveStateEntry *se;
> >
> > +    if (!migration_in_colo_state()) {
> > +        qemu_savevm_state_header(f);
> > +    }
> >      cpu_synchronize_all_states();
>
> So this changes qemu_save_device_state to use savevm_state_header
> which feels reasonable, but that includes the 'configuration'
> section;  do we want that?  Is that OK for Xen's use in
> qmp_xen_save_devices_state?
>

If I understand correctly, Xen COLO doesn't currently use the QEMU
migration code to do this job; Xen COLO has an independent COLO
framework that does the same job (using the Xen migration code).
So I think it is OK for Xen.
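
To make the question concrete, the changed function now looks roughly
like this (simplified):

    int qemu_save_device_state(QEMUFile *f)
    {
        SaveStateEntry *se;

        if (!migration_in_colo_state()) {
            /* Non-COLO callers such as qmp_xen_save_devices_state()
             * take this path.  Previously this function wrote only
             * QEMU_VM_FILE_MAGIC and QEMU_VM_FILE_VERSION;
             * qemu_savevm_state_header() additionally emits the
             * 'configuration' section, which is the behaviour change
             * being asked about for Xen.
             */
            qemu_savevm_state_header(f);
        }
        cpu_synchronize_all_states();
        /* ... iterate savevm_state.handlers as before ... */
    }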


Thank you for your review and continued support.
Zhang Chen


>
> >      QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
> > @@ -1394,8 +1401,6 @@ enum LoadVMExitCodes {
> >      LOADVM_QUIT     =  1,
> >  };
> >
> > -static int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis);
> > -
> >  /* ------ incoming postcopy messages ------ */
> >  /* 'advise' arrives before any transfers just to tell us that a postcopy
> >   * *might* happen - it might be skipped if precopy transferred everything
> > @@ -2075,7 +2080,7 @@ void qemu_loadvm_state_cleanup(void)
> >      }
> >  }
> >
> > -static int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis)
> > +int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis)
> >  {
> >      uint8_t section_type;
> >      int ret = 0;
> > @@ -2229,6 +2234,22 @@ int qemu_loadvm_state(QEMUFile *f)
> >      return ret;
> >  }
> >
> > +int qemu_load_device_state(QEMUFile *f)
> > +{
> > +    MigrationIncomingState *mis = migration_incoming_get_current();
> > +    int ret;
> > +
> > +    /* Load QEMU_VM_SECTION_FULL section */
> > +    ret = qemu_loadvm_state_main(f, mis);
> > +    if (ret < 0) {
> > +        error_report("Failed to load device state: %d", ret);
> > +        return ret;
> > +    }
> > +
> > +    cpu_synchronize_all_post_init();
> > +    return 0;
> > +}
> > +
> >  int save_snapshot(const char *name, Error **errp)
> >  {
> >      BlockDriverState *bs, *bs1;
> > diff --git a/migration/savevm.h b/migration/savevm.h
> > index c6d46b37a2..cf7935dd68 100644
> > --- a/migration/savevm.h
> > +++ b/migration/savevm.h
> > @@ -53,8 +53,12 @@ void qemu_savevm_send_postcopy_ram_discard(QEMUFile *f, const char *name,
> >                                             uint64_t *start_list,
> >                                             uint64_t *length_list);
> >  void qemu_savevm_send_colo_enable(QEMUFile *f);
> > +void qemu_savevm_live_state(QEMUFile *f);
> > +int qemu_save_device_state(QEMUFile *f);
> >
> >  int qemu_loadvm_state(QEMUFile *f);
> >  void qemu_loadvm_state_cleanup(void);
> > +int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis);
> > +int qemu_load_device_state(QEMUFile *f);
> >
> >  #endif
> > --
> > 2.17.0
>
> Dave
>
> >
> --
> Dr. David Alan Gilbert / address@hidden / Manchester, UK
>

