From: Juan Quintela
Subject: Re: [Qemu-devel] [Migration Bug? ] Occasionally, the content of VM's memory is inconsistent between Source and Destination of migration
Date: Thu, 26 Mar 2015 11:29:43 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.4 (gnu/linux)

Wen Congyang <address@hidden> wrote:
> On 03/25/2015 05:50 PM, Juan Quintela wrote:
>> zhanghailiang <address@hidden> wrote:
>>> Hi all,
>>>
>>> We found that, sometimes, the content of the VM's memory is
>>> inconsistent between the source side and the destination side
>>> when we check it just after migration finishes but before the VM
>>> continues to run.
>>>
>>> We used a patch like the one below to find this issue (see the attachment).
>>> Steps to reproduce:
>>>
>>> (1) Compile QEMU:
>>> ./configure --target-list=x86_64-softmmu --extra-ldflags="-lssl" && make
>>>
>>> (2) Command and output:
>>> SRC: # x86_64-softmmu/qemu-system-x86_64 -enable-kvm -cpu
>>> qemu64,-kvmclock -netdev tap,id=hn0 -device
>>> virtio-net-pci,id=net-pci0,netdev=hn0 -boot c -drive
>>> file=/mnt/sdb/pure_IMG/sles/sles11_sp3.img,if=none,id=drive-virtio-disk0,cache=unsafe
>>> -device
>>> virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0
>>> -vnc :7 -m 2048 -smp 2 -device piix3-usb-uhci -device usb-tablet
>>> -monitor stdio
>>
>> Could you try to reproduce:
>> - without vhost
>> - without virtio-net
>> - cache=unsafe is going to give you trouble, but trouble should only
>> happen after migration of pages has finished.
>
> If I use an IDE disk, it doesn't happen.
> Even if I use virtio-net with vhost=on, it still doesn't happen. I guess
> it is because I migrate the guest while it is booting. The virtio-net
> device is not used in that case.
Kevin, Stefan, Michael, any great idea?
Thanks, Juan.
>
> Thanks
> Wen Congyang
>
>>
>> What kind of load were you having when reproducing this issue?
>> Just to confirm, you have been able to reproduce this without COLO
>> patches, right?
>>
>>> (qemu) migrate tcp:192.168.3.8:3004
>>> before saving ram complete
>>> ff703f6889ab8701e4e040872d079a28
>>> md_host : after saving ram complete
>>> ff703f6889ab8701e4e040872d079a28
>>>
>>> DST: # x86_64-softmmu/qemu-system-x86_64 -enable-kvm -cpu
>>> qemu64,-kvmclock -netdev tap,id=hn0,vhost=on -device
>>> virtio-net-pci,id=net-pci0,netdev=hn0 -boot c -drive
>>> file=/mnt/sdb/pure_IMG/sles/sles11_sp3.img,if=none,id=drive-virtio-disk0,cache=unsafe
>>> -device
>>> virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0
>>> -vnc :7 -m 2048 -smp 2 -device piix3-usb-uhci -device usb-tablet
>>> -monitor stdio -incoming tcp:0:3004
>>> (qemu) QEMU_VM_SECTION_END, after loading ram
>>> 230e1e68ece9cd4e769630e1bcb5ddfb
>>> md_host : after loading all vmstate
>>> 230e1e68ece9cd4e769630e1bcb5ddfb
>>> md_host : after cpu_synchronize_all_post_init
>>> 230e1e68ece9cd4e769630e1bcb5ddfb
>>>
>>> This happens occasionally, and it is easier to reproduce when the
>>> migration command is issued while the VM is starting up.
>>
>> OK, a couple of things. Memory doesn't have to be exactly identical.
>> Virtio devices in particular do funny things on "post-load". There
>> are no guarantees about that as far as I know; we should end up with
>> an equivalent device state in memory.
>>
>>> We have done further tests and found that some pages have been
>>> dirtied but their corresponding bits in migration_bitmap are not set.
>>> We can't figure out which QEMU module fails to set the bitmap
>>> when it dirties a VM page;
>>> it is very difficult for us to trace every action that dirties the VM's pages.
>>
>> This seems to point to a bug in one of the devices.
>>
>>> Actually, we first found this problem during COLO FT
>>> development, where it triggered some strange issues in the
>>> VM that all pointed to inconsistent VM memory. (We have tried
>>> saving all of the VM's memory to the slave side at every
>>> COLO FT checkpoint, and then everything is OK.)
>>>
>>> Is it OK for some pages not to be transferred to the destination
>>> during migration? Or is it a bug?
>>
>> The pages transferred should be the same; it is only after device
>> state transmission that things could change.
>>
>>> This issue has blocked our COLO development... :(
>>>
>>> Any help will be greatly appreciated!
>>
>> Later, Juan.
>>
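
The check discussed in this thread prints one digest over all guest RAM on each side, which shows that memory differs but not where. Below is a minimal sketch of narrowing the difference down to individual pages, assuming both RAM images are available as raw byte strings of equal length. `PAGE_SIZE`, `page_digests`, and `differing_pages` are hypothetical names for illustration, not QEMU functions, and SHA-256 is an arbitrary choice of hash.

```python
import hashlib
from typing import List

# Assumption: 4 KiB target pages, as on x86_64 guests.
PAGE_SIZE = 4096


def page_digests(ram: bytes, page_size: int = PAGE_SIZE) -> List[str]:
    """Return one hex digest per page of a raw RAM image."""
    return [
        hashlib.sha256(ram[off:off + page_size]).hexdigest()
        for off in range(0, len(ram), page_size)
    ]


def differing_pages(src: bytes, dst: bytes,
                    page_size: int = PAGE_SIZE) -> List[int]:
    """Return the indices of pages whose digests differ between src and dst."""
    if len(src) != len(dst):
        raise ValueError("RAM images differ in size")
    src_digests = page_digests(src, page_size)
    dst_digests = page_digests(dst, page_size)
    return [
        index
        for index, (a, b) in enumerate(zip(src_digests, dst_digests))
        if a != b
    ]


if __name__ == "__main__":
    # Toy example: four zeroed pages, with one byte changed in page 2
    # on the "destination" side.
    source = bytes(PAGE_SIZE * 4)
    destination = bytearray(source)
    destination[PAGE_SIZE * 2 + 5] = 0xFF
    print(differing_pages(source, bytes(destination)))
```

The page indices returned could then be checked against the pages that the migration bitmap reported as dirty, to see which writes were never logged.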