From: Peter Xu
Subject: Re: [Qemu-devel] [PATCH for-3.0 0/9] migration: postcopy recovery unit test, bug fixes
Date: Tue, 10 Jul 2018 11:27:25 +0800
User-agent: Mutt/1.10.0 (2018-05-17)

On Fri, Jul 06, 2018 at 11:56:59AM +0100, Dr. David Alan Gilbert wrote:
> * Dr. David Alan Gilbert (address@hidden) wrote:
> > * Peter Xu (address@hidden) wrote:
> > > Based-on: <address@hidden>
> > > 
> > > Based on the series to unbreak postcopy:
> > >   Subject: [PATCH v3 0/4] migation: unbreak postcopy recovery
> > >   Message-Id: <address@hidden>
> > > 
> > > This series introduce a new postcopy recovery test.  The new test
> > > actually helped me to identify two bugs there so fix them as well
> > > before 3.0 release.
> > > 
> > > Patch 1: a trivial cleanup for existing postcopy ram load, which I
> > >          found a bit confusing during debugging the problem.
> > > 
> > > Patch 2-3: two bug fixes that address different issues.  Please see
> > >            the commit log for more information.
> > > 
> > > Patch 4-9: add the postcopy recovery unit test.
> > > 
> > > Please review.  Thanks,
> > 
> > Queued
> 
> Hi Peter,
>   There's a problem in there somewhere;  I'm getting
> an intermittent failure of the test if I run a make check -j 8 on my
> laptop.  Just running two copies of tests/migration-test in parallel
> sometimes triggers it (but not if I turn on QTEST_LOG!).
> But it's always failing with:
> 
>   ERROR:/home/dgilbert/git/migpull/tests/migration-test.c:373:migrate_recover: assertion failed: (qdict_haskey(rsp, "return"))

Hmm, this looks like a race.  I suspect the destination VM has not yet
reached the postcopy-paused state when the test sends the
migrate-recover command.

Could you try these two small patches and see whether they fix the
problem?

================

commit d875ea1a98932174e3fa202859b65df26def174d
Author: Peter Xu <address@hidden>
Date:   Tue Jul 10 11:17:24 2018 +0800

    migration: show pause/recover state on dst host

    These two states are missing from the "query-migrate" output on the
    destination VM.  Add them so that the query returns the expected
    status.

    Signed-off-by: Peter Xu <address@hidden>

diff --git a/migration/migration.c b/migration/migration.c
index 0404c53215..8d56d56930 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -911,6 +911,8 @@ static void fill_destination_migration_info(MigrationInfo *info)
     case MIGRATION_STATUS_CANCELLED:
     case MIGRATION_STATUS_ACTIVE:
     case MIGRATION_STATUS_POSTCOPY_ACTIVE:
+    case MIGRATION_STATUS_POSTCOPY_PAUSED:
+    case MIGRATION_STATUS_POSTCOPY_RECOVER:
     case MIGRATION_STATUS_FAILED:
     case MIGRATION_STATUS_COLO:
         info->has_status = true;

================
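
To illustrate why the missing case labels matter, here is a trimmed-down
sketch of the pattern.  It is not the QEMU code: the enum, the struct and
fill_info() are hypothetical stand-ins, and I am assuming that a state
which is not listed in the switch ends up with has_status left unset.

```c
#include <stdbool.h>

/* Hypothetical, trimmed-down stand-ins for the real QEMU types. */
enum mig_status {
    MIG_NONE,
    MIG_ACTIVE,
    MIG_POSTCOPY_ACTIVE,
    MIG_POSTCOPY_PAUSED,
    MIG_POSTCOPY_RECOVER,
    MIG_FAILED,
};

struct mig_info {
    bool has_status;
    enum mig_status status;
};

/* Report a status only for the states that are listed in the switch. */
static void fill_info(struct mig_info *info, enum mig_status state)
{
    info->has_status = false;

    switch (state) {
    case MIG_ACTIVE:
    case MIG_POSTCOPY_ACTIVE:
    case MIG_POSTCOPY_PAUSED:   /* listed, as in the patch above */
    case MIG_POSTCOPY_RECOVER:  /* listed, as in the patch above */
    case MIG_FAILED:
        info->has_status = true;
        info->status = state;
        break;
    default:
        /* Unlisted states report nothing. */
        break;
    }
}
```

Without the two postcopy labels, a paused destination would fall into
the default branch and the caller would see no status at all.

================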

commit 9fa7fc773961cd0ea0b5f70a166def0d8aebf464
Author: Peter Xu <address@hidden>
Date:   Tue Jul 10 11:18:48 2018 +0800

    tests: don't send recovery cmd until dst pauses

    Signed-off-by: Peter Xu <address@hidden>

diff --git a/tests/migration-test.c b/tests/migration-test.c
index 96e69dab99..45558446f1 100644
--- a/tests/migration-test.c
+++ b/tests/migration-test.c
@@ -646,6 +646,13 @@ static void test_postcopy_recovery(void)
      */
     migrate_pause(from);

+    /*
+     * Wait for the destination side to reach the postcopy-paused state.
+     * The migrate-recover command can only succeed if the destination
+     * machine is in the paused state.
+     */
+    wait_for_migration_status(to, "postcopy-paused");
+
     /*
      * Create a new socket to emulate a new channel that is different
      * from the broken migration channel; tell the destination to

================
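
For reference, the idea behind the second patch is simply "poll the
destination until it reports the expected status, then send the
command".  A minimal standalone sketch of that pattern follows.  It does
not use the qtest helpers: wait_for_status(), the status_query_fn
callback and the fake_dst toy are all hypothetical names made up for
this illustration, and the poll limit is my own addition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback: returns the destination's current status string. */
typedef const char *(*status_query_fn)(void *opaque);

/*
 * Poll until query() reports the wanted status or max_polls is used up.
 * Returns true if the wanted status was observed.
 */
static bool wait_for_status(status_query_fn query, void *opaque,
                            const char *wanted, unsigned max_polls)
{
    assert(query != NULL && wanted != NULL);

    for (unsigned i = 0; i < max_polls; i++) {
        if (strcmp(query(opaque), wanted) == 0) {
            return true;
        }
        /* A real test would sleep briefly here between polls. */
    }
    return false;
}

/* Toy destination: stays "postcopy-active" for a few polls, then pauses. */
struct fake_dst {
    unsigned polls_until_paused;
};

static const char *fake_dst_status(void *opaque)
{
    struct fake_dst *dst = opaque;

    if (dst->polls_until_paused > 0) {
        dst->polls_until_paused--;
        return "postcopy-active";
    }
    return "postcopy-paused";
}
```

With a fake_dst that needs three polls to pause,
wait_for_status(fake_dst_status, &dst, "postcopy-paused", 10) returns
true, and only after that point would the recovery command be sent.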

Thanks!

-- 
Peter Xu


