Re: [Qemu-devel] post-copy is broken?
From: Dr. David Alan Gilbert
Subject: Re: [Qemu-devel] post-copy is broken?
Date: Wed, 20 Apr 2016 18:27:55 +0100
User-agent: Mutt/1.5.24 (2015-08-30)

Hi,
Just a follow-up with a little more debug.
I modified the test so it doesn't quit after the first miscomparison (see
diff below), and looking at the failures on real hardware I've seen:
/x86_64/postcopy: Memory content inconsistency at 3800000 first_byte = 30 last_byte = 30 current = 10 hit_edge = 0
Memory content inconsistency at 38fe000 first_byte = 30 last_byte = 10 current = 30 hit_edge = 0
and then another time:
/x86_64/postcopy: Memory content inconsistency at 4c00000 first_byte = 9a last_byte = 99 current = 1 hit_edge = 1
Memory content inconsistency at 4cec000 first_byte = 9a last_byte = 1 current = 99 hit_edge = 1
So in both cases what we're seeing is, starting on a 2M page boundary, a page
that is read on the destination as zero instead of getting the migrated value,
but somewhere later in that 2M page it starts behaving. (In the first example
the counter had reached 0x30, except for those pages which hadn't been
transferred, where the counter is much lower at 0x10.)
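
To make the 2M observation concrete, here is a minimal sketch of the alignment arithmetic. It is not part of the test; the helper names are made up for illustration, and it assumes 2 MiB huge pages (the usual THP size on x86_64).

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 2 MiB huge pages, as is usual for THP on x86_64. */
#define HUGE_PAGE_SIZE (2ULL * 1024 * 1024)

/* Hypothetical helper: start of the 2 MiB page that contains addr. */
static uint64_t huge_page_base(uint64_t addr)
{
    return addr & ~(HUGE_PAGE_SIZE - 1);
}

/* Hypothetical helper: true if addr sits exactly on a 2 MiB boundary. */
static bool is_huge_page_aligned(uint64_t addr)
{
    return huge_page_base(addr) == addr;
}
```

With the addresses from the output above, 0x3800000 and 0x4c00000 are both 2M aligned, and the points where the values recover (0x38fe000 and 0x4cec000) fall inside those same 2M pages.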
Testing it in my VM, I added some debug for where I'd been doing an madvise
DONTNEED previously:
ram_discard_range: pc.ram:0xf51000 for 42094592
ram_discard_range: pc.ram:0x5259000 for 18509824
Memory content inconsistency at f51000 first_byte = 6d last_byte = 6d current = 9e hit_edge = 0
Memory content inconsistency at 1000000 first_byte = 6d last_byte = 9e current = 6d hit_edge = 0
So that's saying that from f51000..1000000 it was wrong - so not just one
page, but up to the THP edge. (It then got back to the right value - 6d - on
the page edge.) Note how the start corresponds to the address I'd previously
done a discard on, but it doesn't cover the whole discard range - just up to
the THP page boundary. Nothing in my userspace code knows about THP (other
than turning it off).
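
The range arithmetic can be sketched as follows. Again this is illustration only: the helper names are hypothetical, and it assumes 2 MiB huge pages and that the pc.ram offsets printed by ram_discard_range line up with the addresses the test reports.

```c
#include <stdint.h>

/* Assumption: 2 MiB huge pages, as is usual for THP on x86_64. */
#define HUGE_PAGE_SIZE (2ULL * 1024 * 1024)

/* Hypothetical helper: first 2 MiB boundary at or above addr. */
static uint64_t huge_page_round_up(uint64_t addr)
{
    return (addr + HUGE_PAGE_SIZE - 1) & ~(HUGE_PAGE_SIZE - 1);
}

/* Hypothetical helper: one past the last byte of a discard of len bytes. */
static uint64_t discard_end(uint64_t start, uint64_t len)
{
    return start + len;
}
```

For the first discard above, huge_page_round_up(0xf51000) gives 0x1000000, which is where the values recover, while discard_end(0xf51000, 42094592) gives 0x3776000, so the bad region is much shorter than the discarded range.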
Dave
@@ -251,6 +251,7 @@ static void check_guests_ram(void)
     uint8_t first_byte;
     uint8_t last_byte;
     bool hit_edge = false;
+    bool bad = false;
 
     qtest_memread(global_qtest, start_address, &first_byte, 1);
     last_byte = first_byte;
@@ -271,11 +272,12 @@ static void check_guests_ram(void)
                                 " first_byte = %x last_byte = %x current = %x"
                                 " hit_edge = %x\n",
                                 address, first_byte, last_byte, b, hit_edge);
-                assert(0);
+                bad = true;
             }
         }
         last_byte = b;
     }
+    assert(!bad);
     fprintf(stderr, "first_byte = %x last_byte = %x hit_edge = %x OK\n",
                     first_byte, last_byte, hit_edge);
 }
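
For anyone who wants to play with the check without a guest, here is a rough standalone sketch of the same idea: it walks an array of per-page bytes and counts every inconsistency rather than stopping at the first. The function name and the array input are made up for illustration. The edge rule (a single step down by one is allowed once) is an assumption based on the output above; the diff doesn't show that part of the test.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for the qtest loop: pages[i] is the byte sampled
 * from page i. Returns the number of inconsistencies found.
 */
static size_t count_inconsistencies(const uint8_t *pages, size_t n_pages)
{
    size_t bad = 0;
    bool hit_edge = false;
    uint8_t first_byte, last_byte;

    if (n_pages == 0) {
        return 0;
    }
    first_byte = pages[0];
    last_byte = first_byte;

    for (size_t i = 1; i < n_pages; i++) {
        uint8_t b = pages[i];

        if (b != last_byte) {
            if (((b + 1) % 256) == last_byte && !hit_edge) {
                /* Assumed rule: one step down by one is allowed, once. */
                hit_edge = true;
            } else {
                fprintf(stderr, "Memory content inconsistency at page %zu"
                                " first_byte = %x last_byte = %x current = %x"
                                " hit_edge = %x\n",
                        i, (unsigned)first_byte, (unsigned)last_byte,
                        (unsigned)b, (unsigned)hit_edge);
                bad++;
            }
        }
        last_byte = b;
    }
    return bad;
}
```

Feeding it a run such as 30 30 10 10 30 30 reports two inconsistencies, one where the stale value starts and one where the expected value comes back, which is the same pairing seen in the failures above.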
--
Dr. David Alan Gilbert / address@hidden / Manchester, UK