Daniel P. Berrange
Re: [Qemu-devel] Hang with migration multi-thread compression under high load
Thu, 28 Apr 2016 09:22:03 +0100
On Thu, Apr 28, 2016 at 03:27:39AM +0000, Li, Liang Z wrote:
> > I've been testing various features of migration and have hit a problem with
> > the multi-thread compression. It works fine when I have 2 or more threads,
> > but if I tell it to only use a single thread, then it almost always hangs.
> > I'm doing a migration between 2 guests on the same machine over a tcp
> > localhost socket, using this command line to launch them:
> > /home/berrange/src/virt/qemu/x86_64-softmmu/qemu-system-x86_64
> > -chardev socket,id=mon,path=/var/tmp/qemu-src-4644-monitor.sock
> > -mon chardev=mon,mode=control
> > -display none
> > -vga none
> > -machine accel=kvm
> > -kernel /boot/vmlinuz-4.4.7-300.fc23.x86_64
> > -initrd /home/berrange/src/virt/qemu/tests/migration/initrd-stress.img
> > -append "noapic edd=off printk.time=1 noreplace-smp
> > cgroup_disable=memory pci=noearly console=ttyS0 debug ramsize=1"
> > -chardev stdio,id=cdev0
> > -device isa-serial,chardev=cdev0
> > -m 1536
> > -smp 1
> > The target VM also gets
> > -incoming tcp:localhost:9000
> > When the VM hangs, the source QEMU shows this stack trace:
> What do you mean by "VM hangs"? Does the VM stop responding,
> or does the live migration process just fail to complete?
The live migration process stops transferring any data, and the
monitor on the target host stops responding to input, because
the main thread is stuck in the decompress_data_with_multi_threads function.
> I do the test in my environment, it works for me.
NB, to make it more likely to happen you need to have a highly
loaded host - if the host is mostly idle it seems to work fine.
> Could you try running 'info migrate' in the QEMU monitor on the source side
> to check whether the live migration process is still ongoing? If the
> 'transferred ram' value stays unchanged, it shows that a deadlock has happened.
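Since the monitor in the command line above is in control (QMP) mode, the equivalent of 'info migrate' is the QMP query-migrate command. A minimal sketch of how successive replies could be compared to spot a stall, assuming each sample is the "return" dictionary of one query-migrate reply. The helper name migration_stalled and the min_repeats threshold are hypothetical and chosen only for illustration:

```python
def migration_stalled(samples, min_repeats=3):
    """Return True if the newest `min_repeats` samples all report status
    "active" and the same ram.transferred byte count.

    `samples` is a list of query-migrate "return" dictionaries, oldest first.
    """
    recent = samples[-min_repeats:]
    if len(recent) < min_repeats:
        return False  # not enough history to judge
    if any(s.get("status") != "active" for s in recent):
        return False  # completed, failed, or not started: not a stall
    transferred = {s.get("ram", {}).get("transferred") for s in recent}
    # Exactly one distinct, non-missing value means no progress was made.
    return len(transferred) == 1 and None not in transferred
```

How far apart to take the samples is a judgment call. On a heavily loaded host a slow but live migration can look stalled if the interval is too short.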
The migration status is "active" and the transferred RAM is stuck
at approx 3-4 MB, not making any progress. As mentioned in the
description, the source QEMU is stuck in a blocking sendmsg() as
the TCP recv buffer is full on the target.
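The blocking behaviour described here can be reproduced in isolation. This is a rough sketch, not QEMU code: it uses a local socket pair from socket.socketpair() (typically a UNIX stream pair rather than TCP, which is an assumption made to keep it self-contained) and a receiver that never reads. The sender is switched to non-blocking mode so that the point where a blocking sendmsg() would stall shows up as an exception instead of a hang. The function name and the limit parameter are hypothetical.

```python
import socket


def bytes_until_send_would_block(chunk_size=4096, limit=64 * 1024 * 1024):
    """Write to a socket whose peer never reads.

    Returns (bytes_sent, would_block). would_block is True once the kernel
    buffers are full, which is where a blocking send would stall.
    """
    sender, receiver = socket.socketpair()
    sender.setblocking(False)  # raise instead of hanging when buffers fill
    sent = 0
    chunk = b"\0" * chunk_size
    try:
        while sent < limit:  # safety cap so the loop always terminates
            try:
                sent += sender.send(chunk)
            except BlockingIOError:
                return sent, True
        return sent, False
    finally:
        sender.close()
        receiver.close()
```

The exact byte count depends on the kernel's socket buffer sizes, so it is only indicative. What matters is that the sender cannot make progress until the receiving side starts reading again.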
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|