From: Avi Kivity
Subject: Re: [Qemu-devel] QEMU/KVM SCSI lock up
Date: Thu, 03 Apr 2008 11:38:08 +0300
User-agent: Thunderbird 2.0.0.12 (X11/20080226)

Matteo Frigo wrote:
> kvm-64 hangs under heavy disk I/O with scsi disks. To reproduce,
> create a fresh qcow2 disk, boot linux, and execute
>
>     dd if=/dev/sdX of=/dev/null bs=1M
>
> on the fresh disk. See also
> https://sourceforge.net/tracker/?func=detail&atid=893831&aid=1895893&group_id=180599
>
> I have attached a patch that appears to fix the problem. The bug seems
> to be the following. scsi_read_data() does the following
>
>     bdrv_aio_read()
>     r->sector += n;
>     r->sector_count -= n;
>
> For reasons that I do not fully understand, bdrv_aio_read() does not
> return immediately, but instead it calls scsi_read_data() recursively.

What happens (I think) is that bdrv_aio_read() completes immediately and calls the completion callback, which starts a read for the next batch of sectors.

> Since ``r->sector += n;'' has not been executed yet, the re-entrant
> call triggers a read of the same sector, which breaks the
> producer-consumer lockstep. The fix is to swap the operations as
> follows:
>
>     r->sector += n;
>     r->sector_count -= n;
>     bdrv_aio_read()
>
> A similar fix applies to scsi_write_data().

Will that not issue the read for the wrong sector?

I think the correct fix is to move the r->sector and r->sector_count adjustments into scsi_read_complete() and scsi_write_complete().

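To make the re-entrancy concrete, here is a toy model of the two orderings. It is only a sketch under stated assumptions, not the QEMU code: fake_aio_read(), read_data(), read_complete(), run_demo() and the Request fields are hypothetical stand-ins, and the first read is simply assumed to complete immediately while later ones are deferred.

```c
#include <assert.h>
#include <stddef.h>

#define CHUNK   2    /* sectors per batch (arbitrary for this demo) */
#define MAX_LOG 16

typedef struct Request Request;
typedef void Completion(Request *r);

struct Request {
    int sector;        /* next sector to read */
    int sector_count;  /* sectors still to be read */
    int n;             /* size of the batch currently in flight */
    int fix;           /* 0: adjust after issuing, 1: adjust on completion */
};

static int issued[MAX_LOG];     /* start sector of every read issued */
static int issued_len;
static int sync_left;           /* reads that will complete immediately */
static Completion *pending_cb;  /* at most one deferred completion */
static Request *pending_req;

static void read_data(Request *r);

/* Hypothetical stand-in for bdrv_aio_read(): logs the read, then either
 * completes on the spot (re-entering the caller) or defers the callback. */
static void fake_aio_read(Request *r, int sector, Completion *cb)
{
    assert(issued_len < MAX_LOG);
    issued[issued_len++] = sector;
    if (sync_left > 0) {
        sync_left--;
        cb(r);
    } else {
        pending_cb = cb;
        pending_req = r;
    }
}

/* Completion callback: in the fixed variant the bookkeeping happens here,
 * before the next batch is requested. */
static void read_complete(Request *r)
{
    if (r->fix) {
        r->sector += r->n;
        r->sector_count -= r->n;
    }
    read_data(r);
}

static void read_data(Request *r)
{
    int n;

    if (r->sector_count <= 0)
        return;
    n = r->sector_count < CHUNK ? r->sector_count : CHUNK;
    r->n = n;
    fake_aio_read(r, r->sector, read_complete);
    if (!r->fix) {
        /* Original ordering: too late if the callback already ran. */
        r->sector += n;
        r->sector_count -= n;
    }
}

/* Reads 4 sectors starting at sector 0; copies the issued start sectors
 * into out[] and returns how many reads were issued. */
int run_demo(int fix, int *out)
{
    Request r = { 0, 4, 0, fix };
    int i;

    issued_len = 0;
    sync_left = 1;
    pending_cb = NULL;
    read_data(&r);
    while (pending_cb) {            /* deliver deferred completions */
        Completion *cb = pending_cb;
        pending_cb = NULL;
        cb(pending_req);
    }
    for (i = 0; i < issued_len; i++)
        out[i] = issued[i];
    return issued_len;
}
```

With the original ordering, run_demo(0, out) issues reads at sectors 0 and 0, so the first batch is read twice and the second batch is never requested. With the adjustment done in the completion callback, run_demo(1, out) issues reads at sectors 0 and 2.
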
Long term we want to replace the recursion by queuing.

-- 
error compiling committee.c: too many arguments to function