qemu-devel

Re: [Qemu-devel] [PATCH v2] block: avoid SIGUSR2


From: Cleber Rosa
Subject: Re: [Qemu-devel] [PATCH v2] block: avoid SIGUSR2
Date: Fri, 28 Oct 2011 09:20:23 -0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:7.0) Gecko/20110927 Thunderbird/7.0

On 10/28/2011 08:33 AM, Kevin Wolf wrote:
On 27.10.2011 16:32, Kevin Wolf wrote:
On 27.10.2011 16:15, Kevin Wolf wrote:
On 27.10.2011 15:57, Stefan Hajnoczi wrote:
On Thu, Oct 27, 2011 at 03:26:23PM +0200, Kevin Wolf wrote:
On 19.09.2011 16:37, Frediano Ziglio wrote:
Now that the iothread is always compiled in, sending a signal seems only an
additional step. This patch also avoids writing to two pipes (one from the
signal handler and one in qemu_service_io).

Works with kvm enabled or disabled. strace output is more readable (fewer
syscalls).

Signed-off-by: Frediano Ziglio <address@hidden>
Something in this change has bad effects, in the sense that it seems to
break bdrv_read_em.
How does it break bdrv_read_em?  Are you seeing QEMU hung with 100% CPU
utilization or deadlocked?
Sorry, I should have been more detailed here.

No, it's nothing obvious, it must be some subtle side effect. The result
of bdrv_read_em itself seems to be correct (return value and checksum of
the read buffer).

However, instead of booting into the DOS setup I only get an error
message "Kein System oder Laufwerksfehler" (roughly: "No system or drive
error"; I don't know how it reads in English DOS versions), which seems to
be produced by the boot sector.

I excluded all of the minor changes, so I'm sure that it's caused by the
switch from kill() to a direct call of the function that writes into the
pipe.

One interesting thing is that qemu_aio_wait() does not release the QEMU
mutex, so we cannot write to a pipe with the mutex held and then spin
waiting for the iothread to do work for us.

Exactly how kill() and qemu_notify_event() differ I'm not sure
right now, but it could be a factor.
This would cause a hang, right? Then it isn't what I'm seeing.
While trying out some more things, I added some fprintfs to
posix_aio_process_queue() and suddenly it also fails with the kill()
version. So what has changed might really just be the timing, and it
could be a race somewhere that has always (?) existed.
Replying to myself again... It looks like there is a problem with
reentrancy in fdctrl_transfer_handler. I think this would have been
guarded by the AsyncContexts before, but we don't have them any more.

qemu-system-x86_64: /root/upstream/qemu/hw/fdc.c:1253:
fdctrl_transfer_handler: Assertion `reentrancy == 0' failed.

Program received signal SIGABRT, Aborted.

(gdb) bt
#0  0x0000003ccd2329a5 in raise () from /lib64/libc.so.6
#1  0x0000003ccd234185 in abort () from /lib64/libc.so.6
#2  0x0000003ccd22b935 in __assert_fail () from /lib64/libc.so.6
#3  0x000000000046ff09 in fdctrl_transfer_handler (opaque=<value
optimized out>, nchan=<value optimized out>, dma_pos=<value optimized out>,
     dma_len=<value optimized out>) at /root/upstream/qemu/hw/fdc.c:1253
#4  0x000000000046702c in channel_run () at /root/upstream/qemu/hw/dma.c:348
#5  DMA_run () at /root/upstream/qemu/hw/dma.c:378
#6  0x000000000040b0e1 in qemu_bh_poll () at async.c:70
#7  0x000000000040aa19 in qemu_aio_wait () at aio.c:147
#8  0x000000000041c355 in bdrv_read_em (bs=0x131fd80, sector_num=19,
buf=<value optimized out>, nb_sectors=1) at block.c:2896
#9  0x000000000041b3d2 in bdrv_read (bs=0x131fd80, sector_num=19,
buf=0x1785a00 "IO      SYS!", nb_sectors=1) at block.c:1062
#10 0x000000000041b3d2 in bdrv_read (bs=0x131f430, sector_num=19,
buf=0x1785a00 "IO      SYS!", nb_sectors=1) at block.c:1062
#11 0x000000000046fbb8 in do_fdctrl_transfer_handler (opaque=0x1785788,
nchan=2, dma_pos=<value optimized out>, dma_len=512)
     at /root/upstream/qemu/hw/fdc.c:1178
#12 0x000000000046fecf in fdctrl_transfer_handler (opaque=<value
optimized out>, nchan=<value optimized out>, dma_pos=<value optimized out>,
     dma_len=<value optimized out>) at /root/upstream/qemu/hw/fdc.c:1255
#13 0x000000000046702c in channel_run () at /root/upstream/qemu/hw/dma.c:348
#14 DMA_run () at /root/upstream/qemu/hw/dma.c:378
#15 0x000000000046e456 in fdctrl_start_transfer (fdctrl=0x1785788,
direction=1) at /root/upstream/qemu/hw/fdc.c:1107
#16 0x0000000000558a41 in kvm_handle_io (env=0x1323ff0) at
/root/upstream/qemu/kvm-all.c:834
#17 kvm_cpu_exec (env=0x1323ff0) at /root/upstream/qemu/kvm-all.c:976
#18 0x000000000053686a in qemu_kvm_cpu_thread_fn (arg=0x1323ff0) at
/root/upstream/qemu/cpus.c:661
#19 0x0000003ccda077e1 in start_thread () from /lib64/libpthread.so.0
#20 0x0000003ccd2e151d in clone () from /lib64/libc.so.6

I'm afraid that we can only avoid things like this reliably if we
convert all devices to be direct users of AIO/coroutines. The current
block layer infrastructure doesn't emulate the behaviour of bdrv_read
accurately as bottom halves can be run in the nested main loop.

For floppy, the following seems to be a quick fix (Lucas, Cleber, does
this solve your problems?), though it's not very satisfying. And I'm not
quite sure yet why it doesn't always happen with kill() in
posix-aio-compat.c.

diff --git a/hw/dma.c b/hw/dma.c
index 8a7302a..1d3b6f1 100644
--- a/hw/dma.c
+++ b/hw/dma.c
@@ -358,6 +358,13 @@ static void DMA_run (void)
      struct dma_cont *d;
      int icont, ichan;
      int rearm = 0;
+    static int running = 0;
+
+    if (running) {
+        goto out;
+    } else {
+        running = 1;
+    }

      d = dma_controllers;

@@ -374,6 +381,8 @@ static void DMA_run (void)
          }
      }

+    running = 0;
+out:
      if (rearm)
          qemu_bh_schedule_idle(dma_bh);
  }

Kevin

Kevin,

In my quick test (compiling qemu.git master + your dma patch, and running a FreeDOS floppy image) it does not make any visible difference.

The boot is still stuck after printing "FreeDOS" at the console.

PS: We will run a full-blown test with a Windows installation using a floppy, but the results with the FreeDOS floppy have been very consistent with the full-blown test.


