
From: Stefan Hajnoczi
Subject: Re: [PATCH v1 0/2] Add timeout mechanism to qmp actions
Date: Mon, 14 Sep 2020 14:27:38 +0100

On Tue, Aug 11, 2020 at 09:54:08PM +0800, Zhenyu Ye wrote:
> Hi Kevin,
> On 2020/8/10 23:38, Kevin Wolf wrote:
> > Am 10.08.2020 um 16:52 hat Zhenyu Ye geschrieben:
> >> Before doing qmp actions, we need to lock the qemu_global_mutex,
> >> so the qmp actions should not take too long time.
> >>
> >> Unfortunately, some qmp actions need to acquire aio context and
> >> this may take a long time.  The vm will soft lockup if this time
> >> is too long.
> > 
> > Do you have a specific situation in mind where getting the lock of an
> > AioContext can take a long time? I know that the main thread can
> > block for considerable time, but QMP commands run in the main thread, so
> > this patch doesn't change anything for this case. It would be effective
> > if an iothread blocks, but shouldn't everything running in an iothread
> > be asynchronous and therefore keep the AioContext lock only for a short
> > time?
> > 
> Theoretically, everything running in an iothread is asynchronous. However,
> some 'asynchronous' actions are not entirely non-blocking, such as
> io_submit().  It can block when the iodepth is large and the I/O pressure
> is high.  If we run a QMP command such as 'info block' at that moment, it
> may cause a vm soft lockup.  This series makes these QMP commands safer.
> I reproduced the scenario as follows:
> 1. Create a vm with 4 disks, using an iothread.
> 2. Put load on the host CPU.  In my scenario, the CPU usage exceeds 95%.
> 3. Stress the 4 disks in the vm at the same time.  I used fio with
> these parameters:
>        fio -rw=randrw -bs=1M -size=1G -iodepth=512 -ioengine=libaio -numjobs=4
> 4. Run block query commands, for example, via virsh:
>       virsh qemu-monitor-command [vm name] --hmp info block
> Then the vm will soft lockup; the calltrace is:
> [  192.311393] watchdog: BUG: soft lockup - CPU#1 stuck for 42s! [kworker/1:1:33]

Sorry I haven't had time to investigate this myself yet.

Do you also have a QEMU backtrace when the hang occurs?

Let's find out if QEMU is stuck in the io_submit(2) syscall or whether
there's an issue in QEMU itself that causes the softlockup (for example,
aio_poll() with the global mutex held).

