qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH 02/15] postcopy: Transmit ram size summary word


From: Dr. David Alan Gilbert
Subject: Re: [Qemu-devel] [PATCH 02/15] postcopy: Transmit ram size summary word
Date: Wed, 25 Jan 2017 16:18:23 +0000
User-agent: Mutt/1.7.1 (2016-10-04)

* Juan Quintela (address@hidden) wrote:
> "Dr. David Alan Gilbert (git)" <address@hidden> wrote:
> > From: "Dr. David Alan Gilbert" <address@hidden>
> >
> > Replace the host page-size in the 'advise' command by a pagesize
> > summary bitmap; if the VM is just using normal RAM then
> > this will be exactly the same as before, however if they're using
> > huge pages they'll be different, and thus:
> >    a) Migration from/to old qemu's that don't understand huge pages
> >       will fail early.
> >    b) Migrations with different size RAMBlocks will also fail early.
> >
> > The previous block size check during RAM block transmission will also
> > catch these cases, but this catches a good summary and should be
> > clearer on a version mismatch.
> >
> > Signed-off-by: Dr. David Alan Gilbert <address@hidden>
> > ---
> >  include/migration/migration.h |  1 +
> >  migration/ram.c               | 17 +++++++++++++++++
> >  migration/savevm.c            | 32 +++++++++++++++++++++-----------
> >  3 files changed, 39 insertions(+), 11 deletions(-)
> 
> 
> Hi
> 
> Why it is not enough with the previous patch where we compare than blocks
> have the same page size in both ends?

They catch it a bit late, especially since we add extra data to the migration
stream; the 'advise' happens nice and early so catches it immediately.

> I assume that issuing an madvise for block with the correct page_size
> would be enough, no?

Unfortunately not; you can't explicitly switch pagesize via madvise,
and unfortunately the hugetlbfs stuff doesn't do madvise DONTNEED, so you
have to use an fallocate to do that job for that block.

Dave

> Or I am missing something obvious?
> 
> Thanks, JUan.
> 
> >
> > diff --git a/include/migration/migration.h b/include/migration/migration.h
> > index c309d23..0188bcf 100644
> > --- a/include/migration/migration.h
> > +++ b/include/migration/migration.h
> > @@ -355,6 +355,7 @@ void global_state_store_running(void);
> >  void flush_page_queue(MigrationState *ms);
> >  int ram_save_queue_pages(MigrationState *ms, const char *rbname,
> >                           ram_addr_t start, ram_addr_t len);
> > +uint64_t ram_pagesize_summary(void);
> >  
> >  PostcopyState postcopy_state_get(void);
> >  /* Set the state and return the old state */
> > diff --git a/migration/ram.c b/migration/ram.c
> > index 39998f5..40fe888 100644
> > --- a/migration/ram.c
> > +++ b/migration/ram.c
> > @@ -608,6 +608,23 @@ static void migration_bitmap_sync_init(void)
> >      iterations_prev = 0;
> >  }
> >  
> > +/* Returns a summary bitmap of the page sizes of all RAMBlocks;
> > + * for VMs with just normal pages this is equivalent to the
> > + * host page size.  If it's got some huge pages then it's the OR
> > + * of all the different page sizes.
> > + */
> > +uint64_t ram_pagesize_summary(void)
> > +{
> > +    RAMBlock *block;
> > +    uint64_t summary = 0;
> > +
> > +    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
> > +        summary |= block->page_size;
> > +    }
> > +
> > +    return summary;
> > +}
> > +
> >  static void migration_bitmap_sync(void)
> >  {
> >      RAMBlock *block;
> > diff --git a/migration/savevm.c b/migration/savevm.c
> > index 0363372..aaae044 100644
> > --- a/migration/savevm.c
> > +++ b/migration/savevm.c
> > @@ -828,7 +828,7 @@ int qemu_savevm_send_packaged(QEMUFile *f, const 
> > uint8_t *buf, size_t len)
> >  void qemu_savevm_send_postcopy_advise(QEMUFile *f)
> >  {
> >      uint64_t tmp[2];
> > -    tmp[0] = cpu_to_be64(getpagesize());
> > +    tmp[0] = cpu_to_be64(ram_pagesize_summary());
> >      tmp[1] = cpu_to_be64(1ul << qemu_target_page_bits());
> >  
> >      trace_qemu_savevm_send_postcopy_advise();
> > @@ -1305,7 +1305,7 @@ static int qemu_loadvm_state_main(QEMUFile *f, 
> > MigrationIncomingState *mis);
> >  static int loadvm_postcopy_handle_advise(MigrationIncomingState *mis)
> >  {
> >      PostcopyState ps = postcopy_state_set(POSTCOPY_INCOMING_ADVISE);
> > -    uint64_t remote_hps, remote_tps;
> > +    uint64_t remote_pagesize_summary, local_pagesize_summary, remote_tps;
> >  
> >      trace_loadvm_postcopy_handle_advise();
> >      if (ps != POSTCOPY_INCOMING_NONE) {
> > @@ -1317,17 +1317,27 @@ static int 
> > loadvm_postcopy_handle_advise(MigrationIncomingState *mis)
> >          return -1;
> >      }
> >  
> > -    remote_hps = qemu_get_be64(mis->from_src_file);
> > -    if (remote_hps != getpagesize())  {
> > +    remote_pagesize_summary = qemu_get_be64(mis->from_src_file);
> > +    local_pagesize_summary = ram_pagesize_summary();
> > +
> > +    if (remote_pagesize_summary != local_pagesize_summary)  {
> >          /*
> > -         * Some combinations of mismatch are probably possible but it gets
> > -         * a bit more complicated.  In particular we need to place whole
> > -         * host pages on the dest at once, and we need to ensure that we
> > -         * handle dirtying to make sure we never end up sending part of
> > -         * a hostpage on it's own.
> > +         * This detects two potential causes of mismatch:
> > +         *   a) A mismatch in host page sizes
> > +         *      Some combinations of mismatch are probably possible but it 
> > gets
> > +         *      a bit more complicated.  In particular we need to place 
> > whole
> > +         *      host pages on the dest at once, and we need to ensure that 
> > we
> > +         *      handle dirtying to make sure we never end up sending part 
> > of
> > +         *      a hostpage on it's own.
> > +         *   b) The use of different huge page sizes on source/destination
> > +         *      a more fine grain test is performed during RAM block 
> > migration
> > +         *      but this test here causes a nice early clear failure, and
> > +         *      also fails when passed to an older qemu that doesn't
> > +         *      do huge pages.
> >           */
> > -        error_report("Postcopy needs matching host page sizes (s=%d d=%d)",
> > -                     (int)remote_hps, getpagesize());
> > +        error_report("Postcopy needs matching RAM page sizes (s=%" PRIx64
> > +                                                             " d=%" PRIx64 
> > ")",
> > +                     remote_pagesize_summary, local_pagesize_summary);
> >          return -1;
> >      }
--
Dr. David Alan Gilbert / address@hidden / Manchester, UK



reply via email to

[Prev in Thread] Current Thread [Next in Thread]