Re: [Qemu-devel] [Bug 1207686] [NEW] qemu-1.4.0 and onwards, linux kerne

qemu-devel

From:	Stefan Hajnoczi
Subject:	Re: [Qemu-devel] [Bug 1207686] [NEW] qemu-1.4.0 and onwards, linux kernel 3.2.x, heavy I/O leads to kernel_hung_tasks_timout_secs message and unresponsive qemu-process
Date:	Fri, 2 Aug 2013 17:24:15 +0200
User-agent:	Mutt/1.5.21 (2010-09-15)

On Fri, Aug 02, 2013 at 09:58:29AM -0000, Oliver Francke wrote:
> after some testing I tried to narrow down a problem, which was initially 
> reported by some users.
> Seen on different distros so far: Debian 7.1, Ubuntu 12.04 LTS and
> IPFire-2.3.
> 
> All using some flavour of linux-3.2.x kernel.
> 
> Under Ubuntu, for example, an upgrade to "Linux 3.8.0-27-generic x86_64"
> solves the problem.

Is that a guest kernel upgrade?

> The problem can be triggered with a workload such as:
> 
> spew -v --raw -P -t -i 3 -b 4k -p random -B 4k 1G /tmp/doof.dat
> and in parallel do some apt-get install/remove/whatever.
> 
> That results in a somewhat stuck qemu-session with the bad
> "kernel_hung_task..." messages.
> 
> A typical command-line is as follows:
> 
> /usr/local/qemu-1.6.0/bin/qemu-system-x86_64 -usbdevice tablet
> -enable-kvm -daemonize -pidfile /var/run/qemu-server/760.pid
> -monitor unix:/var/run/qemu-server/760.mon,server,nowait
> -vnc unix:/var/run/qemu-server/760.vnc,password
> -qmp unix:/var/run/qemu-server/760.qmp,server,nowait
> -nodefaults -serial none -parallel none
> -device virtio-net-pci,mac=00:F1:70:00:2F:80,netdev=vlan0d0 -netdev
> type=tap,id=vlan0d0,ifname=tap760i0d0,script=/etc/fcms/add_if.sh,downscript=/etc/fcms/downscript.sh
> -name 1155823384-4 -m 512 -vga cirrus -k de -smp sockets=1,cores=1
> -device virtio-blk-pci,drive=virtio0 -drive
> format=raw,file=rbd:1155823384/vm-760-disk-1.rbd:rbd_cache=false,cache=writeback,if=none,id=virtio0,media=disk,index=0,aio=native
> -drive
> format=raw,file=rbd:1155823384/vm-760-swap-1.rbd:rbd_cache=false,cache=writeback,if=virtio,media=disk,index=1,aio=native
> -drive if=ide,media=cdrom,id=ide1-cd0,readonly=on -drive
> if=ide,media=cdrom,id=ide1-cd1,readonly=on -boot order=dc
> 
> No "system_reset", "sendkey ctrl-alt-delete" or "q" is accepted in the
> monitoring-session; the process has to be hard-killed.

Yesterday I saw a possibly related report on IRC.  It was a Windows
guest running under OpenStack with images on Ceph.

They reported that the QEMU process would lock up - ping would not work
and their management tools showed 0 CPU activity for the guest.
However, they were able to "kick" the guest by taking a VNC screenshot
(I think).  Then it would come back to life.

If you have a Linux guest that is reporting kernel_hung_task, then it
could be a similar scenario.

Please confirm that the hung task message is from inside the guest.
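
One way to check this is to look for the hung-task warning in the kernel
log on both sides. The following is only a minimal sketch: the helper name
count_hung_tasks is made up for illustration, and it assumes the warning
uses the usual "blocked for more than N seconds" wording.

```shell
# Count hung-task reports in kernel log text read from stdin.
# The pattern matches the kernel's "blocked for more than N seconds" warning.
count_hung_tasks() {
    # grep -c prints 0 and exits non-zero when nothing matches,
    # so "|| true" keeps the function usable under "set -e".
    grep -c 'blocked for more than [0-9][0-9]* seconds' || true
}

# Usage (run once inside the guest and once on the host, then compare):
#   dmesg | count_hung_tasks
```

A non-zero count inside the guest together with zero on the host would
suggest that the message comes from the guest kernel.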

If you are able to reproduce this and have an alternative non-Ceph
storage pool, please try that since Ceph is common to both these bug
reports.
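
As a rough sketch of such a test, the first rbd-backed -drive from the
command line above could be pointed at a local raw image instead. The
helper name local_drive_arg and the image path are hypothetical, the image
would have to be populated with the guest disk beforehand (not shown), and
aio=native is left out here as a simplifying assumption.

```shell
# Build a -drive option value for a local raw image instead of an rbd: URL.
# $1: path to the image file (hypothetical), $2: drive id as used by -device.
local_drive_arg() {
    printf 'format=raw,file=%s,cache=writeback,if=none,id=%s,media=disk,index=0\n' "$1" "$2"
}

# Usage: substitute the result for the first rbd-backed -drive, e.g.
#   -device virtio-blk-pci,drive=virtio0 \
#   -drive "$(local_drive_arg /var/tmp/vm-760-disk-1.raw virtio0)"
```

If the hang no longer reproduces with the local image under the same
workload, that points towards the Ceph/rbd path rather than virtio-blk.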

Stefan
