Re: [Qemu-devel] test-filter-mirror hangs

From: Jason Wang
Subject: Re: [Qemu-devel] test-filter-mirror hangs
Date: Wed, 23 Jan 2019 10:43:11 +0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.2.1

On 2019/1/22 2:56 AM, Peter Maydell wrote:
On Thu, 17 Jan 2019 at 09:46, Jason Wang <address@hidden> wrote:

On 2019/1/15 12:33 AM, Zhang Chen wrote:

On Sat, Jan 12, 2019 at 12:15 AM Dr. David Alan Gilbert
<address@hidden> wrote:

     * Peter Maydell (address@hidden) wrote:
     > Recently I've noticed that test-filter-mirror has been hanging
     > intermittently, typically when run on some other TCG architecture.
     > In the instance I've just looked at, this was with s390x guest on
     > x86-64 host, though I've also seen it on other host archs and
     > perhaps with other guests.

     Watch out to see if you really do see it for other guests;
     it carefully avoids using virtio-net to avoid vhost; but on s390x it
     uses virtio-net-ccw - could that hit the vhost it was trying to avoid?

     > Below is a backtrace, though it seems to be pretty unhelpful.
     > Anybody got any theories ? Does the mirror test rely on dirty
     > memory bitmaps like the migration test (which also hangs
     > occasionally with TCG due to some bug I'm sure we've investigated
     > in the past) ?

     I don't think it relies on the CPU at all.
  I have no idea about this currently, but Jason and I designed the
test case.
Adding Jason: do you have any comments on this?

I can't reproduce this locally with s390x-softmmu. It looks to me like the
test should be independent of any kind of emulation. It should pass
as long as the mainloop works.
I've just seen a hang with ppc64 guest on s390x host, so it is
indeed not specific to s390x guest (and so not specific to
virtio-net either, since the ppc64 guest setup uses e1000).

-- PMM

I finally reproduced this locally, after hundreds (sometimes thousands) of runs.

Bisection points to OOB monitor[1].

It looks to me that, now that OOB is used unconditionally, we have lost the barrier that made sure the socket was connected before test-filter-mirror.c sends packets. Is there another similarly simple thing we could do to kick the mainloop?
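
To make the suspected ordering problem concrete, here is a toy model in Python. It is not QEMU code: every name in it (ToyMainLoop, in_band_command, out_of_band_command, deliver_packet, run_test) is hypothetical, and it assumes, as suggested above, that a command answered by the mainloop acts as a barrier while a command answered elsewhere does not. The real hang is intermittent; the model only shows the worst-case ordering deterministically.

```python
from collections import deque


class ToyMainLoop:
    """Hypothetical stand-in for an event loop; not the QEMU mainloop."""

    def __init__(self):
        self.pending = deque()   # events waiting for the loop to run
        self.connected = False   # set only when the loop handles "connect"
        self.mirrored = []       # packets that were actually mirrored

    def queue_connect(self):
        # The peer has connected, but the loop has not noticed yet.
        self.pending.append(self._on_connect)

    def _on_connect(self):
        self.connected = True

    def run_pending(self):
        # Handle every queued event in order.
        while self.pending:
            self.pending.popleft()()

    def in_band_command(self):
        # The reply is produced by the loop itself, so all earlier
        # events (including "connect") are handled before it returns.
        self.run_pending()
        return "ok"

    def out_of_band_command(self):
        # The reply is produced without involving the loop, so it says
        # nothing about whether "connect" has been handled.
        return "ok"

    def deliver_packet(self, data):
        # A packet is mirrored only if the connection is already known.
        if self.connected:
            self.mirrored.append(data)
            return True
        return False


def run_test(use_oob):
    """Return True if the packet is mirrored, False if it is dropped."""
    loop = ToyMainLoop()
    loop.queue_connect()
    if use_oob:
        loop.out_of_band_command()
    else:
        loop.in_band_command()
    return loop.deliver_packet(b"hello")
```

In this model run_test(False) mirrors the packet and run_test(True) drops it, which is the situation in which a test waiting for the mirrored packet would wait forever. Any replacement barrier would need the property that in_band_command has here: it must not return until the mainloop has handled the connection.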



[1]
commit 8258292e18c39480b64eba9f3551ab772ce29b5d (HEAD, refs/bisect/bad)
Author: Peter Xu <address@hidden>
Date:   Tue Oct 9 14:27:15 2018 +0800

    monitor: Remove "x-oob", offer capability "oob" unconditionally

    Out-of-band command execution was introduced in commit cf869d53172.
    Unfortunately, we ran into a regression, and had to turn it into an
    experimental option for 2.12 (commit be933ffc23).


    The regression has since been fixed (commit 951702f39c7 "monitor: bind
    dispatch bh to iohandler context").  A thorough re-review of OOB
    commands led to a few more issues, which have also been addressed.

    This patch partly reverts be933ffc23 (monitor: new parameter "x-oob"),
    and makes QMP monitors again offer capability "oob" whenever they can
    provide it, i.e. when the monitor's character device is capable of
    running in an I/O thread.

    Some trivial touch-up in the test code is required to make sure qmp-test
    won't break.

    Reviewed-by: Markus Armbruster <address@hidden>
    Reviewed-by: Marc-André Lureau <address@hidden>
    Signed-off-by: Peter Xu <address@hidden>
    Message-Id: <address@hidden>
    [Conflict with "monitor: check if chardev can switch gcontext for OOB"
    resolved, commit message updated]
    Signed-off-by: Markus Armbruster <address@hidden>
