[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] IDE disk FLUSH take more than 30 secs, the SUSE guest r
From: |
Andreas Färber |
Subject: |
Re: [Qemu-devel] IDE disk FLUSH take more than 30 secs, the SUSE guest reports "lost interrupt and the file system becomes read-only" |
Date: |
Sun, 26 May 2013 19:08:02 +0200 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130329 Thunderbird/17.0.5 |
Am 21.05.2013 09:12, schrieb Gonglei (Arei):
> Through analysis, I found that because the system call the fdatasync command
> in the Qemu over 30s,
> after the Guest's kernel thread detects the io transferation is timeout, went
> to check IDE disk state.
> But the IDE disk status is 0x50, rather than the BSY status, and then
> departed error process...
>
> the path of kernel's action is :
> scsi_softirq_done
> scsi_eh_scmd_add
> scsi_error_handler
> shost->transportt->eh_strategy_handler
> ata_scsi_error
> ap->ops->lost_interrupt
> ata_sff_lost_interrupt
> Finally, the file system becomes read-only.
>
> Why not set the IDE disk for the BSY status When 0xe7 command is executed in
> the Qemu?
Have you actually tried that out with a patch such as the following?
diff --git a/hw/ide/core.c b/hw/ide/core.c
index c7a8041..bf1ff18 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -795,6 +795,8 @@ static void ide_flush_cb(void *opaque, int ret)
{
IDEState *s = opaque;
+ s->status &= ~BUSY_STAT;
+
if (ret < 0) {
/* XXX: What sector number to set here? */
if (ide_handle_rw_error(s, -ret, BM_STATUS_RETRY_FLUSH)) {
@@ -814,6 +816,7 @@ void ide_flush_cache(IDEState *s)
return;
}
+ s->status |= BUSY_STAT;
bdrv_acct_start(s->bs, &s->acct, 0, BDRV_ACCT_FLUSH);
bdrv_aio_flush(s->bs, ide_flush_cb, s);
}
No clue if this is spec-compliant. ;)
Note however that qemu_fdatasync() is done in the flush callback of
block/raw-posix.c, so IIUC everything calling bdrv_aio_flush() or
bdrv_flush_all() may potentially run into issues beyond just ATA:
hw/block/virtio-blk.c
hw/block/xen_disk.c
hw/ide/core.c
hw/scsi/scsi-disk.c
cpus.c:do_vm_stop()
hw/xen/xen_platform.c:platform_fixed_ioport_writew()
qemu_fdatasync() further occurs in:
hw/block/dataplane/virtio-blk.c:process_request()
hw/9pfs/virtio-9p-*.c
Quite possibly not all of them are problematic, but flush times >30 sec
are very likely not well tested by developers...
Regards,
Andreas
--
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg