Re: [Qemu-devel] thread-pool.c race condition?
From: Paolo Bonzini
Subject: Re: [Qemu-devel] thread-pool.c race condition?
Date: Thu, 02 Apr 2015 18:43:43 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.5.0

On 02/04/2015 18:26, Stefan Hajnoczi wrote:
> John Snow has reported that qemu-io can hang when the host is under
> heavy load. He made the following observations in gdb:
>
> 1. The program is sitting in aio_poll() (called by bdrv_prwv_co())
> waiting for request completion.
>
> 2. The thread pool has a ThreadPoolElement with ->state == THREAD_DONE.
>
> The ThreadPoolElement should have been reaped by
> thread_pool_completion_bh() and its callback invoked. For some reason
> this didn't happen and the program is blocked in poll(2) waiting.
>
> This suggests a race condition in thread-pool.c or qemu_bh_schedule()
> (used to complete ThreadPoolElement from a QEMU event loop).
>
> I don't have a good theory why this happens yet. Just wanted to share
> in case someone else hits this problem.

Laszlo hit something very similar fairly easily with virtio-scsi (but
not virtio-blk!) on aarch64 hosts. Any attempt to debug it (ranging
from compilation with -O0 to tracing) made it disappear. A reliable
reproducer with qemu-io would be a dream...

Paolo
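
For readers following the thread, the completion handoff described in the quoted report (a worker marks an element done, a notification is scheduled, and the event loop reaps the element and invokes its callback) can be sketched roughly as below. This is a toy model in Python, not QEMU's thread-pool.c: `ToyPool`, `Element`, `poll()` and the use of `threading.Event` in place of a bottom half are all hypothetical names and stand-ins chosen for illustration. It does not claim to show the actual bug, only the ordering such a handoff depends on.

```python
import queue
import threading

# Element states, loosely mirroring the THREAD_DONE state named in the report.
THREAD_QUEUED, THREAD_ACTIVE, THREAD_DONE = range(3)


class Element:
    """One submitted request (hypothetical stand-in for ThreadPoolElement)."""

    def __init__(self, func, callback):
        self.func = func
        self.callback = callback
        self.state = THREAD_QUEUED
        self.ret = None


class ToyPool:
    """Single-worker pool whose completions are reaped by the caller's loop."""

    def __init__(self):
        self.lock = threading.Lock()
        self.elements = []                # submitted but not yet reaped
        self.work = queue.Queue()
        self.notify = threading.Event()   # stands in for a scheduled bottom half
        threading.Thread(target=self._worker, daemon=True).start()

    def submit(self, func, callback):
        elem = Element(func, callback)
        with self.lock:
            self.elements.append(elem)
        self.work.put(elem)
        return elem

    def _worker(self):
        while True:
            elem = self.work.get()
            with self.lock:
                elem.state = THREAD_ACTIVE
            ret = elem.func()
            with self.lock:
                elem.ret = ret
                elem.state = THREAD_DONE  # publish the result first...
            self.notify.set()             # ...then wake the event loop

    def poll(self, timeout=None):
        """One loop iteration: wait for a notification, then reap."""
        if not self.notify.wait(timeout):
            return 0
        # Clear BEFORE scanning. A completion that lands during the scan
        # then leaves the flag set, so the next poll() still sees it.
        self.notify.clear()
        with self.lock:
            done = [e for e in self.elements if e.state == THREAD_DONE]
            self.elements = [e for e in self.elements
                             if e.state != THREAD_DONE]
        for e in done:
            e.callback(e.ret)             # run callbacks outside the lock
        return len(done)


if __name__ == "__main__":
    pool = ToyPool()
    results = []
    for i in range(4):
        pool.submit(lambda i=i: i * i, results.append)
    while len(results) < 4:
        pool.poll(timeout=1.0)
    print(sorted(results))
```

In this toy model, moving `self.notify.clear()` after the scan would open a window in which an element reaches THREAD_DONE and its notification is then wiped out, leaving the loop blocked with a finished element still on the list. That is the general shape of the symptom in the report; whether anything similar happens in QEMU is exactly the open question.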