Re: [Qemu-devel] post-copy is broken?


From: Dr. David Alan Gilbert
Subject: Re: [Qemu-devel] post-copy is broken?
Date: Wed, 20 Apr 2016 18:27:55 +0100
User-agent: Mutt/1.5.24 (2015-08-30)

Hi,
  Just a follow-up with a little more debug.

I modified the test so it doesn't quit after the first miscomparison (see
diff below), and looking at the failures on real hardware I've seen:

/x86_64/postcopy: Memory content inconsistency at 3800000 first_byte = 30 last_byte = 30 current = 10 hit_edge = 0
                  Memory content inconsistency at 38fe000 first_byte = 30 last_byte = 10 current = 30 hit_edge = 0

and then another time:
/x86_64/postcopy: Memory content inconsistency at 4c00000 first_byte = 9a last_byte = 99 current = 1 hit_edge = 1
                  Memory content inconsistency at 4cec000 first_byte = 9a last_byte = 1 current = 99 hit_edge = 1

So in both cases what we're seeing is a page, starting on a 2M page boundary,
that is read on the destination as zero instead of getting the migrated value,
but somewhere later in the page it starts behaving. (In the first example the
counter had reached 0x30, except for those pages which hadn't been transferred,
where the counter is much lower at 0x10.)
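
To make the arithmetic on those addresses explicit, here is a minimal sketch
that checks whether an address sits on a huge page boundary and how many base
pages a bad stretch covers. The helper names are made up for illustration, and
the sizes are assumptions (2 MiB huge pages, 4 KiB base pages on x86_64); none
of this comes from the test itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed sizes for x86_64: 2 MiB huge pages, 4 KiB base pages. */
#define HUGE_PAGE_SIZE UINT64_C(0x200000)
#define BASE_PAGE_SIZE UINT64_C(0x1000)

/* Hypothetical helper: true if addr lies exactly on a huge page boundary. */
static bool is_huge_page_aligned(uint64_t addr)
{
    return (addr & (HUGE_PAGE_SIZE - 1)) == 0;
}

/* Hypothetical helper: number of whole base pages in [start, end). */
static uint64_t base_pages_between(uint64_t start, uint64_t end)
{
    assert(end >= start);
    return (end - start) / BASE_PAGE_SIZE;
}
```

With those assumptions, 0x3800000 and 0x4c00000 are both 2M aligned, and the
bad stretches 3800000..38fe000 and 4c00000..4cec000 cover 254 and 236 base
pages respectively, so each one ends before the next 2M boundary.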

Testing it in my VM, I added some debug for where I'd been doing an madvise
DONTNEED previously:

ram_discard_range: pc.ram:0xf51000 for 42094592
ram_discard_range: pc.ram:0x5259000 for 18509824
Memory content inconsistency at f51000 first_byte = 6d last_byte = 6d current = 9e hit_edge = 0
Memory content inconsistency at 1000000 first_byte = 6d last_byte = 9e current = 6d hit_edge = 0

So that's saying that from f51000..1000000 it was wrong - so not just one
page, but up to the THP edge. (It then got back to the right value - 6d - on
the page edge.) Note how the start corresponds to the address I'd previously
done a discard on, but not the whole discard range - just up to the THP page
boundary. Nothing in my userspace code knows about THP (other than turning it
off).
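
As a rough illustration of that observation, the sketch below rounds the
discard start up to the next huge page boundary and computes where the discard
range itself ends. The helper names are hypothetical and the 2 MiB THP size is
an assumption for x86_64; this is only arithmetic on the logged values, not
code from QEMU.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed THP size on x86_64: 2 MiB. */
#define HUGE_PAGE_SIZE UINT64_C(0x200000)

/* Hypothetical helper: round addr up to the next huge page boundary
 * (returns addr unchanged if it is already aligned). */
static uint64_t huge_page_align_up(uint64_t addr)
{
    return (addr + HUGE_PAGE_SIZE - 1) & ~(HUGE_PAGE_SIZE - 1);
}

/* Hypothetical helper: exclusive end address of a discard of len bytes. */
static uint64_t discard_end(uint64_t start, uint64_t len)
{
    assert(len <= UINT64_MAX - start);  /* guard against wrap-around */
    return start + len;
}
```

For the first discard in the log, 0xf51000 rounds up to 0x1000000, which is
where the data became correct again, while the discard range itself
(0xf51000 for 42094592 bytes) runs on to 0x3776000.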

Dave



@@ -251,6 +251,7 @@ static void check_guests_ram(void)
     uint8_t first_byte;
     uint8_t last_byte;
     bool hit_edge = false;
+    bool bad = false;
 
     qtest_memread(global_qtest, start_address, &first_byte, 1);
     last_byte = first_byte;
@@ -271,11 +272,12 @@ static void check_guests_ram(void)
                                 " first_byte = %x last_byte = %x current = %x"
                                 " hit_edge = %x\n",
                                 address, first_byte, last_byte, b, hit_edge);
-                assert(0);
+                bad = true;
             }
         }
         last_byte = b;
     }
+    assert(!bad);
     fprintf(stderr, "first_byte = %x last_byte = %x hit_edge = %x OK\n",
                     first_byte, last_byte, hit_edge);
 }

--
Dr. David Alan Gilbert / address@hidden / Manchester, UK


