From: l00284672
Subject: [Qemu-devel] question: a dead loop in qemu when do blockJobAbort and vm suspend coinstantaneously
Date: Sat, 9 Jun 2018 17:10:10 +0800
User-agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.4.0
Hi, I found a dead loop in qemu when blockJobAbort and vm suspend are
done at the same time.
The qemu backtrace is below:
#0 0x00007ff58b53af1f in ppoll () from /lib64/libc.so.6
#1 0x00000000007fdbd9 in ppoll (__ss=0x0, __timeout=0x7ffcf7055390,
__nfds=<optimized out>, __fds=<optimized out>) at
/usr/include/bits/poll2.h:77
#2 qemu_poll_ns (fds=<optimized out>, nfds=<optimized out>,
address@hidden) at util/qemu-timer.c:334
#3 0x00000000007ff83a in aio_poll (address@hidden,
address@hidden) at util/aio-posix.c:629
#4 0x0000000000776e91 in bdrv_drain_recurse (address@hidden) at
block/io.c:198
#5 0x0000000000776ef2 in bdrv_drain_recurse (address@hidden) at
block/io.c:215
#6 0x00000000007774b8 in bdrv_do_drained_begin (bs=0x3665990,
recursive=<optimized out>, parent=0x0) at block/io.c:291
#7 0x000000000076a79e in blk_drain (blk=0x2780fc0) at
block/block-backend.c:1586
#8 0x000000000072d2a9 in block_job_drain (job=0x29df040) at blockjob.c:123
#9 0x000000000072d228 in block_job_detach_aio_context
(opaque=0x29df040) at blockjob.c:139
#10 0x00000000007298b1 in bdrv_detach_aio_context
(address@hidden) at block.c:4885
#11 0x0000000000729a46 in bdrv_set_aio_context (bs=0x3665990,
new_context=0x268e140) at block.c:4946
#12 0x0000000000499743 in virtio_blk_data_plane_stop (vdev=<optimized
out>) at
/mnt/sdb/lzg/code/shequ_code/5_29/qemu/hw/block/dataplane/virtio-blk.c:285
#13 0x00000000006bce30 in virtio_bus_stop_ioeventfd (bus=0x3de5378) at
hw/virtio/virtio-bus.c:246
#14 0x00000000004c654d in virtio_vmstate_change (opaque=0x3de53f0,
running=<optimized out>, state=<optimized out>)
at /mnt/sdb/lzg/code/shequ_code/5_29/qemu/hw/virtio/virtio.c:2222
#15 0x0000000000561b52 in vm_state_notify (address@hidden,
address@hidden) at vl.c:1514
#16 0x000000000045d67a in do_vm_stop
(address@hidden, address@hidden)
at /mnt/sdb/lzg/code/shequ_code/5_29/qemu/cpus.c:1012
#17 0x000000000045dafd in vm_stop (address@hidden)
at /mnt/sdb/lzg/code/shequ_code/5_29/qemu/cpus.c:2035
#18 0x000000000057301b in qmp_stop (address@hidden) at
qmp.c:106
#19 0x000000000056bf7a in qmp_marshal_stop (args=<optimized out>,
ret=<optimized out>, errp=0x7ffcf7055738) at qapi/qapi-commands-misc.c:784
#20 0x00000000007f2d27 in do_qmp_dispatch (errp=0x7ffcf7055730,
request=0x3e121e0, cmds=<optimized out>) at qapi/qmp-dispatch.c:119
#21 qmp_dispatch (cmds=<optimized out>, address@hidden)
at qapi/qmp-dispatch.c:168
#22 0x00000000004655be in monitor_qmp_dispatch_one
(address@hidden) at
/mnt/sdb/lzg/code/shequ_code/5_29/qemu/monitor.c:4088
#23 0x0000000000465894 in monitor_qmp_bh_dispatcher (data=<optimized
out>) at /mnt/sdb/lzg/code/shequ_code/5_29/qemu/monitor.c:4146
#24 0x00000000007fc571 in aio_bh_call (bh=0x26de7e0) at util/async.c:90
#25 aio_bh_poll (address@hidden) at util/async.c:118
#26 0x00000000007ff6f0 in aio_dispatch (ctx=0x268dd50) at
util/aio-posix.c:436
#27 0x00000000007fc44e in aio_ctx_dispatch (source=<optimized out>,
callback=<optimized out>, user_data=<optimized out>) at util/async.c:261
#28 0x00007ff58bc7c99a in g_main_context_dispatch () from
/lib64/libglib-2.0.so.0
#29 0x00000000007fea3a in glib_pollfds_poll () at util/main-loop.c:215
#30 os_host_main_loop_wait (timeout=<optimized out>) at util/main-loop.c:238
#31 main_loop_wait (address@hidden) at util/main-loop.c:497
#32 0x0000000000561cad in main_loop () at vl.c:1848
#33 0x000000000041995c in main (argc=<optimized out>, argv=<optimized
out>, envp=<optimized out>) at vl.c:4605
The disk is a virtio-blk dataplane disk with a mirror job running. The
dead loop is here:
static void block_job_detach_aio_context(void *opaque)
{
    BlockJob *job = opaque;

    /* In case the job terminates during aio_poll()... */
    job_ref(&job->job);

    job_pause(&job->job);

    while (!job->job.paused && !job_is_completed(&job->job)) {
        job_drain(&job->job);
    }

    job->job.aio_context = NULL;
    job_unref(&job->job);
}
The job's completion has already been deferred to the main loop, but
job_drain only processes the AIO context of the bs, which has no more
work to do. The main loop BH that is scheduled to set the
job->completed flag is therefore never processed, so the loop
condition never changes and the while loop spins forever.
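
To make the shape of the hang easier to see, here is a small toy model in plain C. It is only a sketch and it is not QEMU code: ToyContext, ToyJob, toy_schedule, toy_poll, toy_wait_for_job and toy_scenario are all hypothetical names invented for this illustration, and the iteration budget exists only so that the example terminates. The completion callback is queued on one context while the wait loop polls a different one, which mirrors the situation described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins only; none of these are QEMU types or functions. */
#define MAX_PENDING 8

typedef void (*ToyCallback)(void *opaque);

/* A context is just a small queue of pending callbacks. */
typedef struct ToyContext {
    ToyCallback cb[MAX_PENDING];
    void *opaque[MAX_PENDING];
    size_t count;
} ToyContext;

typedef struct ToyJob {
    bool paused;
    bool completed;
} ToyJob;

/* Queue a callback on a context (plays the role of scheduling a BH). */
static void toy_schedule(ToyContext *ctx, ToyCallback cb, void *opaque)
{
    assert(ctx->count < MAX_PENDING);
    ctx->cb[ctx->count] = cb;
    ctx->opaque[ctx->count] = opaque;
    ctx->count++;
}

/* Run every callback pending on this context only.
 * Callbacks in this toy must not schedule new work. */
static void toy_poll(ToyContext *ctx)
{
    for (size_t i = 0; i < ctx->count; i++) {
        ctx->cb[i](ctx->opaque[i]);
    }
    ctx->count = 0;
}

/* The deferred completion step: it is the only thing that sets the flag. */
static void toy_job_completed_bh(void *opaque)
{
    ToyJob *job = opaque;
    job->completed = true;
}

/* Same loop condition as the snippet above. Returns the number of
 * iterations needed, or -1 if the budget ran out (the "dead loop"). */
static int toy_wait_for_job(ToyJob *job, ToyContext *job_ctx,
                            ToyContext *main_ctx, bool also_poll_main,
                            int budget)
{
    int iterations = 0;

    while (!job->paused && !job->completed) {
        if (iterations == budget) {
            return -1;
        }
        toy_poll(job_ctx);          /* nothing is pending here */
        if (also_poll_main) {
            toy_poll(main_ctx);     /* the completion callback lives here */
        }
        iterations++;
    }
    return iterations;
}

/* Completion is queued on the main context, the job context is empty. */
static int toy_scenario(bool also_poll_main)
{
    ToyContext main_ctx = {0};
    ToyContext job_ctx = {0};
    ToyJob job = { .paused = false, .completed = false };

    toy_schedule(&main_ctx, toy_job_completed_bh, &job);
    return toy_wait_for_job(&job, &job_ctx, &main_ctx, also_poll_main, 1000);
}
```

Calling toy_scenario(false) exhausts the budget and returns -1, because the only callback that can change the loop condition sits in a queue that the loop never runs. Calling toy_scenario(true) returns after a single iteration. This shows only why the loop cannot make progress; it is not a proposed patch, and whether the real fix should poll the main context or avoid waiting at this point is a separate question.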
I have tried many ways to solve it, but none of them solves it
completely. Do you have any good ideas for it? Thanks for your
reply!