
From: dann frazier
Subject: [Qemu-devel] [Bug 1824053] Re: Qemu-img convert appears to be stuck on aarch64 host with low probability
Date: Thu, 06 Jun 2019 22:57:51 -0000

*** This bug is a duplicate of bug 1805256 ***
    https://bugs.launchpad.net/bugs/1805256

** This bug has been marked a duplicate of bug 1805256
   qemu-img hangs on high core count ARM system

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1824053

Title:
  Qemu-img convert appears to be stuck on aarch64 host with low
  probability

Status in QEMU:
  Confirmed

Bug description:
  Hi, I found that qemu-img convert occasionally appears to get stuck
  on an aarch64 host (low probability).

  The convert command line is "qemu-img convert -f qcow2 -O raw
  disk.qcow2 disk.raw".

  The backtrace is below:

  Thread 2 (Thread 0x40000b776e50 (LWP 27215)):
  #0  0x000040000a3f2994 in sigtimedwait () from /lib64/libc.so.6
  #1  0x000040000a39c60c in sigwait () from /lib64/libpthread.so.0
  #2  0x0000aaaaaae82610 in sigwait_compat (opaque=0xaaaac5163b00) at util/compatfd.c:37
  #3  0x0000aaaaaae85038 in qemu_thread_start (args=args@entry=0xaaaac5163b90) at util/qemu_thread_posix.c:496
  #4  0x000040000a3918bc in start_thread () from /lib64/libpthread.so.0
  #5  0x000040000a492b2c in thread_start () from /lib64/libc.so.6

  Thread 1 (Thread 0x40000b573370 (LWP 27214)):
  #0  0x000040000a489020 in ppoll () from /lib64/libc.so.6
  #1  0x0000aaaaaadaefc0 in ppoll (__ss=0x0, __timeout=0x0, __nfds=<optimized out>, __fds=<optimized out>) at /usr/include/bits/poll2.h:77
  #2  qemu_poll_ns (fds=<optimized out>, nfds=<optimized out>, timeout=<optimized out>) at qemu_timer.c:391
  #3  0x0000aaaaaadae014 in os_host_main_loop_wait (timeout=<optimized out>) at main_loop.c:272
  #4  0x0000aaaaaadae190 in main_loop_wait (nonblocking=<optimized out>) at main_loop.c:534
  #5  0x0000aaaaaad97be0 in convert_do_copy (s=0xffffdc32eb48) at qemu-img.c:1923
  #6  0x0000aaaaaada2d70 in img_convert (argc=<optimized out>, argv=<optimized out>) at qemu-img.c:2414
  #7  0x0000aaaaaad99ac4 in main (argc=7, argv=<optimized out>) at qemu-img.c:5305

  The problem seems very similar to the phenomenon described by this
  patch (https://resources.ovirt.org/pub/ovirt-4.1/src/qemu-kvm-
  ev/0025-aio_notify-force-main-loop-wakeup-with-SIGIO-aarch64.patch),
  which forces a main loop wakeup with SIGIO. But that patch was later
  reverted by this one (http://ovirt.repo.nfrance.com/src/qemu-kvm-ev/kvm-
  Revert-aio_notify-force-main-loop-wakeup-with-SIGIO-.patch).

  I can reproduce this problem with qemu.git/master; it still exists
  there. I found that when an I/O request completes, the worker thread
  wants to call aio_notify to wake up the main loop, but it finds that
  ctx->notify_me has already been cleared to 0 by the main loop in
  aio_ctx_check, via atomic_and(&ctx->notify_me, ~1). So the worker
  thread will not write the eventfd to notify the main loop. When this
  interleaving happens, the main loop hangs:
      main loop                              worker thread1                            worker thread2
  ------------------------------------------------------------------------------------------
      qemu_poll_ns                           aio_worker
                                             qemu_bh_schedule(pool->completion_bh)
      glib_pollfds_poll
      g_main_context_check
      aio_ctx_check                                                                    aio_worker
      atomic_and(&ctx->notify_me, ~1)                                                  qemu_bh_schedule(pool->completion_bh)
      /* do something for event */
      qemu_poll_ns
      /* hangs !!! */

  As we know, ctx->notify_me is accessed by both the worker threads and
  the main loop. I think we should add lock protection for
  ctx->notify_me to avoid this happening.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1824053/+subscriptions


