qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] coroutines: block: Co-routine re-entered recursively wh


From: Jason J. Herne
Subject: Re: [Qemu-devel] coroutines: block: Co-routine re-entered recursively when migrating disk with iothreads
Date: Tue, 24 May 2016 11:05:12 -0400
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.6.0

On 05/23/2016 10:12 PM, Fam Zheng wrote:
On Mon, 05/23 14:54, Jason J. Herne wrote:
Using libvirt to migrate a guest and one guest disk that is using iothreads
causes Qemu to crash with the message:
Co-routine re-entered recursively

I've looked into this one a bit but I have not seen anything that
immediately stands out.
Here is what I have found:

In qemu_coroutine_enter:
     if (co->caller) {
         fprintf(stderr, "Co-routine re-entered recursively\n");
         abort();
     }

The value of co->caller is actually changing between the time "if
(co->caller)" is evaluated and the time I print some debug statements
directly under the existing fprintf. I confirmed this by saving the value in
a local variable and printing both the new local variable and co->caller
immediately after the existing fprintf. This would certainly indicate some
kind of concurrency issue. However, it does not necessarily point to the
reason we ended up inside this if statement because co->caller was not NULL
before it was trashed. Perhaps it was trashed more than once then? I figured
maybe the problem was with coroutine pools so I disabled them
(--disable-coroutine-pool) and still hit the bug.

Which coroutine backend are you using?


ucontext normally. I've also reproduced the problem with sigaltstack.


The backtrace is not always identical. Here is one instance:
(gdb) bt
#0  0x000003ffa78be2c0 in raise () from /lib64/libc.so.6
#1  0x000003ffa78bfc26 in abort () from /lib64/libc.so.6
#2  0x0000000080427d80 in qemu_coroutine_enter (co=0xa2cf2b40, opaque=0x0)
at /root/kvmdev/qemu/util/qemu-coroutine.c:112
#3  0x000000008032246e in nbd_restart_write      (opaque=0xa2d0cd40) at
/root/kvmdev/qemu/block/nbd-client.c:114
#4  0x00000000802b3a1c in aio_dispatch (ctx=0xa2c907a0) at
/root/kvmdev/qemu/aio-posix.c:341
#5  0x00000000802b4332 in aio_poll (ctx=0xa2c907a0, blocking=true) at
/root/kvmdev/qemu/aio-posix.c:479
#6  0x0000000080155aba in iothread_run (opaque=0xa2c90260) at
/root/kvmdev/qemu/iothread.c:46
#7  0x000003ffa7a87c2c in start_thread () from /lib64/libpthread.so.0
#8  0x000003ffa798ec9a in thread_start () from /lib64/libc.so.6

It may be worth looking at backtrace of all threads especially the monitor
thread (main thread).


Here is a complete backtrace: (gdb) thread apply all bt
Thread #1 is main.
Thread #13 is the crashing thread.

Thread 50 (Thread 0x3fdeb1f1910 (LWP 29570)):
#0  0x000003ff99c901fc in do_futex_wait () from /lib64/libpthread.so.0
#1  0x000003ff99c90302 in __new_sem_wait_slow () from /lib64/libpthread.so.0
#2 0x00000000804097a4 in qemu_sem_timedwait (sem=0x3ff8c000ce8, ms=10000) at /root/kvmdev/qemu/util/qemu-thread-posix.c:245 #3 0x00000000802a215e in worker_thread (opaque=0x3ff8c000c80) at /root/kvmdev/qemu/thread-pool.c:92
#4  0x000003ff99c87c2c in start_thread () from /lib64/libpthread.so.0
#5  0x000003ff99b8ec9a in thread_start () from /lib64/libc.so.6

Thread 49 (Thread 0x3fdb47ff910 (LWP 29569)):
#0  0x000003ff99c901fc in do_futex_wait () from /lib64/libpthread.so.0
#1  0x000003ff99c90302 in __new_sem_wait_slow () from /lib64/libpthread.so.0
#2 0x00000000804097a4 in qemu_sem_timedwait (sem=0x9c54c8d8, ms=10000) at /root/kvmdev/qemu/util/qemu-thread-posix.c:245 #3 0x00000000802a215e in worker_thread (opaque=0x9c54c870) at /root/kvmdev/qemu/thread-pool.c:92
#4  0x000003ff99c87c2c in start_thread () from /lib64/libpthread.so.0
#5  0x000003ff99b8ec9a in thread_start () from /lib64/libc.so.6

Thread 15 (Thread 0x3ff999ff910 (LWP 29449)):
#0  0x000003ff99b8841e in syscall () from /lib64/libc.so.6
#1 0x00000000804099c6 in futex_wait (ev=0x80ac597c <rcu_call_ready_event>, val=4294967295) at /root/kvmdev/qemu/util/qemu-thread-posix.c:292 #2 0x0000000080409c56 in qemu_event_wait (ev=0x80ac597c <rcu_call_ready_event>) at /root/kvmdev/qemu/util/qemu-thread-posix.c:399 #3 0x00000000804271ec in call_rcu_thread (opaque=0x0) at /root/kvmdev/qemu/util/rcu.c:250
#4  0x000003ff99c87c2c in start_thread () from /lib64/libpthread.so.0
#5  0x000003ff99b8ec9a in thread_start () from /lib64/libc.so.6

Thread 14 (Thread 0x3ff991ff910 (LWP 29451)):
#0  0x000003ff99b8841e in syscall () from /lib64/libc.so.6
#1  0x000003ff9a19e330 in g_cond_wait () from /lib64/libglib-2.0.so.0
#2 0x000000008039d936 in wait_for_trace_records_available () at /root/kvmdev/qemu/trace/simple.c:147 #3 0x000000008039d9c6 in writeout_thread (opaque=0x0) at /root/kvmdev/qemu/trace/simple.c:165
#4  0x000003ff9a17c4cc in g_thread_proxy () from /lib64/libglib-2.0.so.0
#5  0x000003ff99c87c2c in start_thread () from /lib64/libpthread.so.0
#6  0x000003ff99b8ec9a in thread_start () from /lib64/libc.so.6

Thread 13 (Thread 0x3ff989ff910 (LWP 29452)):
#0  0x000003ff99abe2c0 in raise () from /lib64/libc.so.6
#1  0x000003ff99abfc26 in abort () from /lib64/libc.so.6
#2 0x0000000080427d80 in qemu_coroutine_enter (co=0x9c5a4120, opaque=0x0) at /root/kvmdev/qemu/util/qemu-coroutine.c:112 #3 0x000000008032246e in nbd_restart_write (opaque=0x9c5897b0) at /root/kvmdev/qemu/block/nbd-client.c:114 #4 0x00000000802b3a1c in aio_dispatch (ctx=0x9c530770) at /root/kvmdev/qemu/aio-posix.c:341 #5 0x00000000802b4332 in aio_poll (ctx=0x9c530770, blocking=true) at /root/kvmdev/qemu/aio-posix.c:479 #6 0x0000000080155aba in iothread_run (opaque=0x9c530200) at /root/kvmdev/qemu/iothread.c:46
#7  0x000003ff99c87c2c in start_thread () from /lib64/libpthread.so.0
#8  0x000003ff99b8ec9a in thread_start () from /lib64/libc.so.6

Thread 12 (Thread 0x3ff91b5c910 (LWP 29456)):
#0 0x000003ff99c8d68a in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x000000008040932e in qemu_cond_wait (cond=0x9c904690, mutex=0x8065aab0 <qemu_global_mutex>) at /root/kvmdev/qemu/util/qemu-thread-posix.c:123 #2 0x000000008005c1d6 in qemu_kvm_wait_io_event (cpu=0x9c8c8e80) at /root/kvmdev/qemu/cpus.c:1030 #3 0x000000008005c37a in qemu_kvm_cpu_thread_fn (arg=0x9c8c8e80) at /root/kvmdev/qemu/cpus.c:1069
#4  0x000003ff99c87c2c in start_thread () from /lib64/libpthread.so.0
#5  0x000003ff99b8ec9a in thread_start () from /lib64/libc.so.6

Thread 1 (Thread 0x3ff9a6f2a90 (LWP 29433)):
#0 0x000003ff99c8d68a in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x000000008040932e in qemu_cond_wait (cond=0x9c530800, mutex=0x9c5307d0) at /root/kvmdev/qemu/util/qemu-thread-posix.c:123 #2 0x0000000080426a38 in rfifolock_lock (r=0x9c5307d0) at /root/kvmdev/qemu/util/rfifolock.c:59 #3 0x00000000802a1f72 in aio_context_acquire (ctx=0x9c530770) at /root/kvmdev/qemu/async.c:373 #4 0x00000000802b3f54 in aio_poll (ctx=0x9c530770, blocking=true) at /root/kvmdev/qemu/aio-posix.c:415 #5 0x000000008031e7ac in bdrv_flush (bs=0x9c59b5c0) at /root/kvmdev/qemu/block/io.c:2470 #6 0x00000000802a8e6e in bdrv_close (bs=0x9c59b5c0) at /root/kvmdev/qemu/block.c:2134 #7 0x00000000802a9966 in bdrv_delete (bs=0x9c59b5c0) at /root/kvmdev/qemu/block.c:2341 #8 0x00000000802ac7c6 in bdrv_unref (bs=0x9c59b5c0) at /root/kvmdev/qemu/block.c:3376 #9 0x0000000080315340 in mirror_exit (job=0x9c956ed0, opaque=0x9c9570d0) at /root/kvmdev/qemu/block/mirror.c:494 #10 0x00000000802afb52 in block_job_defer_to_main_loop_bh (opaque=0x9c90dc10) at /root/kvmdev/qemu/blockjob.c:476 #11 0x00000000802a10a8 in aio_bh_call (bh=0x9c9090a0) at /root/kvmdev/qemu/async.c:66 #12 0x00000000802a1206 in aio_bh_poll (ctx=0x9c51e6c0) at /root/kvmdev/qemu/async.c:94 #13 0x00000000802b389e in aio_dispatch (ctx=0x9c51e6c0) at /root/kvmdev/qemu/aio-posix.c:308 #14 0x00000000802a1854 in aio_ctx_dispatch (source=0x9c51e6c0, callback=0x0, user_data=0x0) at /root/kvmdev/qemu/async.c:233 #15 0x000003ff9a151c0a in g_main_context_dispatch () from /lib64/libglib-2.0.so.0 #16 0x00000000802b05ce in glib_pollfds_poll () at /root/kvmdev/qemu/main-loop.c:213 #17 0x00000000802b070a in os_host_main_loop_wait (timeout=0) at /root/kvmdev/qemu/main-loop.c:258 #18 0x00000000802b0816 in main_loop_wait (nonblocking=0) at /root/kvmdev/qemu/main-loop.c:506
#19 0x000000008016d434 in main_loop () at /root/kvmdev/qemu/vl.c:1934
#20 0x00000000801756b8 in main (argc=54, argv=0x3ffdcd7ee58, envp=0x3ffdcd7f010) at /root/kvmdev/qemu/vl.c:4656

--
-- Jason J. Herne (address@hidden)




reply via email to

[Prev in Thread] Current Thread [Next in Thread]