Re: [Qemu-devel] A second bug in the IO throttling code


From: Stefan Hajnoczi
Subject: Re: [Qemu-devel] A second bug in the IO throttling code
Date: Mon, 19 Mar 2012 16:05:04 +0000

On Sun, Mar 18, 2012 at 2:40 PM, Chris Webb <address@hidden> wrote:
> Whilst you have patches in progress for the queue draining issue with the IO
> throttling code which triggers the assert()s in the ide driver, I thought I
> should report a second bug I've seen. I'm not sure whether it's related, but
> none of the patch series posted so far appear to fix or affect it.
>
> I find that if I start a guest booting linux using extlinux and set a
> bytes-per-second throttle value less than about 4MB/s, qemu tends to lock up
> completely while the bootloader is loading the kernel. For example, there's
> a tiny 10MB ext4 filesystem gzipped up at
>
>  http://cdw.me.uk/tmp/test.img.gz
>
> which just contains extlinux and a kernel. If you run a VM with qemu HEAD as
>
>  qemu -m 1024 -vnc :1 \
>    -drive if=none,id=ide.0.0,format=raw,cache=none,file=test.img,bps=10000000 \
>    -device ide-drive,bus=ide.0,unit=0,bootindex=1,drive=ide.0.0 \
>    -monitor stdio
>
> and watch on VNC, you'll see it hangs whilst loading the kernel. Once this
> has happened, no further interaction with the monitor is possible, and the
> VNC socket becomes completely unresponsive. This happens about half of the
> time with bps set as high as 2*1024*1024.
>
> I first saw this with the version of the block throttling patches I'd
> back-ported on top of qemu-kvm 1.0, but have checked that the problem is
> still present in HEAD as of this afternoon [361dea401f52].

Thanks for reporting this.  Zhi Yong is travelling so he may not be
able to access email for a few days.

I downloaded your image and reproduced the issue on qemu.git/master
5bd33de6 ("tcg: fix sparc host for AREG0 free operation").  I set bps
to 1 MB per second, which is low but valid.  VNC and the QEMU monitor
froze.  I attached with gdb:

$ gdb -p 3705 x86_64-softmmu/qemu-system-x86_64
(gdb) thread apply all bt

Thread 2 (Thread 0x7f433dea9700 (LWP 3706)):
#0  0x00007f434745a690 in qemu_aio_wait () at aio.c:166
#1  0x00007f434746d2bd in bdrv_rw_co (bs=<optimized out>, sector_num=<optimized out>, buf=<optimized out>, nb_sectors=<optimized out>, is_write=<optimized out>) at block.c:1473
#2  0x00007f43474ed86e in ide_sector_read (s=0x7f43488d6a58) at /home/stefanha/qemu/hw/ide/core.c:480
#3  0x00007f43474ecbf7 in ide_data_readw (opaque=<optimized out>, addr=<optimized out>) at /home/stefanha/qemu/hw/ide/core.c:1692
#4  0x00007f43475d7d3b in memory_region_iorange_read (iorange=0x7f434890bd70, offset=496, width=2, data=0x7f433dea8c50) at /home/stefanha/qemu/memory.c:396
#5  0x00007f43475c84b7 in ioport_readw_thunk (opaque=<optimized out>, addr=<optimized out>) at /home/stefanha/qemu/ioport.c:195
#6  0x00007f43475c8d82 in ioport_read (address=<optimized out>, index=1) at /home/stefanha/qemu/ioport.c:70
#7  cpu_inw (addr=<optimized out>) at /home/stefanha/qemu/ioport.c:318
#8  0x00007f43475cbc21 in kvm_handle_io (count=256, size=2, direction=0, data=<optimized out>, port=496) at /home/stefanha/qemu/kvm-all.c:1117
#9  kvm_cpu_exec (env=0x7f4348870240) at /home/stefanha/qemu/kvm-all.c:1274
#10 0x00007f43475a7171 in qemu_kvm_cpu_thread_fn (arg=0x7f4348870240) at /home/stefanha/qemu/cpus.c:733
#11 0x00007f43458dbb50 in start_thread (arg=<optimized out>) at pthread_create.c:304
#12 0x00007f4343a0990d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#13 0x0000000000000000 in ?? ()

Thread 1 (Thread 0x7f43473a38c0 (LWP 3705)):
#0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:136
#1  0x00007f43458de339 in _L_lock_926 () from /lib/x86_64-linux-gnu/libpthread.so.0
#2  0x00007f43458de15b in __pthread_mutex_lock (mutex=0x7f43482532c0) at pthread_mutex_lock.c:61
#3  0x00007f4347555409 in qemu_mutex_lock (mutex=<optimized out>) at qemu-thread-posix.c:54
#4  0x00007f434752b96c in main_loop_wait (nonblocking=<optimized out>) at main-loop.c:460
#5  0x00007f4347454417 in main_loop () at /home/stefanha/qemu/vl.c:1552
#6  main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at /home/stefanha/qemu/vl.c:3628

What this tells me is:

1. The vcpu thread is blocked in qemu_aio_wait() - it's waiting for
I/O request(s) to complete.
2. The iothread is trying to acquire the global mutex but is blocked
because the vcpu thread has it.  Therefore the monitor and VNC do not
work.

There is a throttled I/O request in a queue and a timer has been set
to wake up and issue the request.  The vcpu thread is in
qemu_aio_wait(), which does not invoke timer callbacks, so we have
deadlocked.  This is a fundamental problem: timers run from the
iothread event loop, but we're in a synchronous context in the vcpu
thread, which holds the global mutex, so the iothread cannot execute.
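
To make the hang concrete, here is a toy model in C.  It is not QEMU
code: every name in it (toy_loop, toy_sync_wait, and so on) is made up
for illustration, and the clock is simulated.  It only assumes what is
described above: a throttled request is parked behind a timer, and the
synchronous wait polls for completion without running timers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model, not QEMU code: every name here is hypothetical. */
struct toy_loop {
    long now_ms;            /* simulated clock */
    long timer_deadline_ms; /* when the throttle timer should fire */
    bool timer_armed;       /* a throttled request is parked behind the timer */
    bool request_done;      /* set once the parked request has completed */
};

/* Timer callback: releases the throttled request, which then completes. */
void toy_throttle_timer_cb(struct toy_loop *l)
{
    l->timer_armed = false;
    l->request_done = true;
}

/* What an event loop iteration would do: run the timer if it has expired. */
void toy_run_timers(struct toy_loop *l)
{
    if (l->timer_armed && l->now_ms >= l->timer_deadline_ms) {
        toy_throttle_timer_cb(l);
    }
}

/*
 * Stand-in for a synchronous wait.  With run_timers == false it only
 * polls for completion, like the wait described in the text.  Returns
 * the number of iterations spent waiting, or -1 if the request never
 * completed within max_iters (the model's version of "hung").
 */
int toy_sync_wait(struct toy_loop *l, int max_iters, bool run_timers)
{
    assert(l != NULL);
    for (int i = 0; i < max_iters; i++) {
        if (l->request_done) {
            return i;
        }
        l->now_ms++;            /* time passes while we wait */
        if (run_timers) {
            toy_run_timers(l);
        }
    }
    return l->request_done ? max_iters : -1;
}
```

With a deadline of 10 ms, toy_sync_wait(&l, 1000, false) gives up and
returns -1 with the timer still armed, while the same call with
run_timers set to true returns 10.  The real situation is worse than
the model, because the thread that could run the timer is also blocked
on the mutex held by the waiter.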

In this specific case it would be nice to convert hw/ide/* to use
bdrv_aio_*() instead of synchronous block I/O functions.  In the
general case we may need to build a warning into qemu that catches
this situation when it occurs.
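
A rough sketch of both ideas, again as a toy model in C rather than
QEMU code.  All names and signatures below (toy_aio_read,
toy_main_loop_iteration, toy_warn_if_sync_wait_would_hang) are
hypothetical and are not the bdrv_aio_*() interface.  The assumption
is only that an asynchronous submit returns immediately and completes
through a callback from the main loop, and that a check before a
synchronous wait can see that a request is still parked on a timer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Toy model, not QEMU code: all names and signatures are hypothetical. */
typedef void toy_completion_fn(void *opaque, int ret);

struct toy_request {
    bool pending;           /* parked behind the throttle timer */
    long deadline_ms;       /* when the throttle allows it to run */
    toy_completion_fn *cb;  /* called on completion */
    void *opaque;           /* passed back to cb */
};

/*
 * Asynchronous submit: park the request and return at once, so the
 * caller goes back to the main loop instead of waiting in place.
 */
void toy_aio_read(struct toy_request *req, long deadline_ms,
                  toy_completion_fn *cb, void *opaque)
{
    assert(req != NULL && cb != NULL);
    req->pending = true;
    req->deadline_ms = deadline_ms;
    req->cb = cb;
    req->opaque = opaque;
}

/*
 * One main loop iteration at simulated time now_ms: if the throttle
 * deadline has passed, complete the request via its callback.
 * Returns true if the request completed in this iteration.
 */
bool toy_main_loop_iteration(struct toy_request *req, long now_ms)
{
    if (req->pending && now_ms >= req->deadline_ms) {
        req->pending = false;
        req->cb(req->opaque, 0);
        return true;
    }
    return false;
}

/*
 * The "warning" idea: call this before entering a synchronous wait.
 * Returns true (and warns) if a request is still waiting on a timer
 * that the synchronous wait would never run.
 */
bool toy_warn_if_sync_wait_would_hang(const struct toy_request *req)
{
    if (req->pending) {
        fprintf(stderr, "warning: synchronous wait while a throttled "
                        "request is pending on a timer\n");
        return true;
    }
    return false;
}

/* Example completion callback: store the result for the caller. */
void toy_read_done(void *opaque, int ret)
{
    *(int *)opaque = ret;
}
```

After toy_aio_read(&req, 10, toy_read_done, &result), the check
reports a pending request; an iteration at 5 ms does nothing, and an
iteration at 10 ms calls toy_read_done and sets result to 0.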

Stefan


