qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: iotest40 problem


From: Li Zhang
Subject: Re: iotest40 problem
Date: Thu, 31 Mar 2022 12:44:16 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.5.0

On 3/31/22 08:10, Hanna Reitz wrote:
On 29.03.22 18:49, Li Zhang wrote:
Update what I observed.

It seems that aqmp is not stable when running test cases.
So I revert the patches as the following, iotest40/41 test cases work well.

Thanks for bisecting.  I haven’t seen this problem before, so I didn’t look into it; CC-ing John, perhaps he figures something.

Thanks a lot.
It happens occassionally. I didn't find a way to reproduce it manually.

Hanna

commit 76cd358671e6b8e7c435ec65b1c44200254514a9

Author: John Snow <jsnow@redhat.com>

Date:   Tue Oct 26 13:56:12 2021 -0400



    python, iotests: replace qmp with aqmp



    Swap out the synchronous QEMUMonitorProtocol from qemu.qmp with the sync

    wrapper from qemu.aqmp instead.



    Add an escape hatch in the form of the environment variable

    QEMU_PYTHON_LEGACY_QMP which allows you to cajole QEMUMachine into using

    the old implementation, proving that both implementations work

    concurrently.



    Signed-off-by: John Snow <jsnow@redhat.com>

    Reviewed-by: Kevin Wolf <kwolf@redhat.com>

    Reviewed-by: Hanna Reitz <hreitz@redhat.com>

    Message-id: 20211026175612.4127598-9-jsnow@redhat.com

    Signed-off-by: John Snow <jsnow@redhat.com>


commit 1611e6cf4e7163f6102b37010a8b7e7120f468b5

Author: John Snow <jsnow@redhat.com>

Date:   Thu Nov 18 15:46:18 2021 -0500



    python/machine: handle "fast" QEMU terminations



    In the case that the QEMU process actually launches -- but then dies so

    quickly that we can't establish a QMP connection to it -- QEMUMachine

    currently calls _post_shutdown() assuming that it never launched the VM

    process.



    This isn't true, though: it "merely" may have failed to establish a QMP

    connection and the process is in the middle of its own exit path.



    If we don't wait for the subprocess, the caller may get a bogus `None`

    return for .exitcode(). This behavior was observed from

    device-crash-test; after the switch to Async QMP, the timings were

    changed such that it was now seemingly possible to witness the failure

    of "vm.launch()" *prior* to the exitcode becoming available.



    The semantic of the `_launched` property is changed in this

    patch. Instead of representing the condition "launch() executed

    successfully", it will now represent "has forked a child process

    successfully". This way, wait() when called in the exit path won't

    become a no-op.



    Signed-off-by: John Snow <jsnow@redhat.com>

    Reviewed-by: Willian Rampazzo <willianr@redhat.com>

    Message-id: 20211118204620.1897674-6-jsnow@redhat.com

    Signed-off-by: John Snow <jsnow@redhat.com>





On 3/25/22 11:17, Li Zhang wrote:
Hi,

I backport some iotests patches to the tree and change timeout.
It doesn't work.

Sometimes, iotest41 also reports the errors.
[ 1347s] +======================================================================
[ 1347s] +ERROR: test_top_node_in_wrong_chain (__main__.TestSingleDrive)
[ 1347s] +----------------------------------------------------------------------
[ 1347s] +Traceback (most recent call last):
[ 1347s] +  File "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py", line 399, in launch
[ 1347s] +    self._launch()
[ 1347s] +  File "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py", line 434, in _launch
[ 1347s] +    self._post_launch()
[ 1347s] +  File "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/qtest.py", line 147, in _post_launch
[ 1347s] +    super()._post_launch()
[ 1347s] +  File "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py", line 340, in _post_launch
[ 1347s] +    self._qmp.accept(self._qmp_timer)
[ 1347s] +  File "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/aqmp/legacy.py", line 69, in accept
[ 1347s] +    timeout
[ 1347s] +  File "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/aqmp/legacy.py", line 42, in _sync
[ 1347s] +    asyncio.wait_for(future, timeout=timeout)
[ 1347s] +  File "/usr/lib64/python3.6/asyncio/base_events.py", line 488, in run_until_complete
[ 1347s] +    return future.result()
[ 1347s] +  File "/usr/lib64/python3.6/asyncio/tasks.py", line 362, in wait_for
[ 1347s] +    raise futures.TimeoutError()
[ 1347s] +concurrent.futures._base.TimeoutError


I can see other errors like this, it's the problem of the socket.

[ 1535s] socket_accept failed: Resource temporarily unavailable
[ 1535s] **
[ 1535s] ERROR:../tests/qtest/libqtest.c:321:qtest_init_without_qmp_handshake: assertion failed: (s->fd >= 0 && s->qmp_fd >= 0)


The script is running the command as this:
/usr/bin/make -O -j4 check-block V=1

I can see the errors on ppc, arm or x86.
But I couldn't reproduce it when I run it manually.

It will be appreciated if any suggestions. Thanks.


On 3/24/22 14:47, Li Zhang wrote:
Hi,

When I run the testsuit on our buidling system, it reports a timeout sometimes not always as the following.
It couldn't connect qmp socket. Any ideas about this problem?


[ 1989s] --- /home/abuild/rpmbuild/BUILD/qemu-6.2.0/tests/qemu-iotests/040.out
[ 1989s] +++ 040.out.bad
[ 1989s] @@ -1,5 +1,55 @@
[ 1989s] -................................................................. [ 1989s] +....ERROR:qemu.aqmp.qmp_client.qemu-6471:Failed to establish connection: asyncio.exceptions.CancelledError [ 1989s] +E..................................ERROR:qemu.aqmp.qmp_client.qemu-6471:Failed to establish connection: asyncio.exceptions.CancelledError
[ 1989s] +E.........................
[ 1989s] +======================================================================
[ 1989s] +ERROR: test_commit_node (__main__.TestActiveZeroLengthImage)
[ 1989s] +----------------------------------------------------------------------
[ 1989s] +Traceback (most recent call last):
[ 1989s] +  File "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/tests/qemu-iotests/040", line 94, in setUp
[ 1989s] +    self.vm.launch()
[ 1989s] +  File "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py", line 399, in launch
[ 1989s] +    self._launch()
[ 1989s] +  File "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py", line 434, in _launch
[ 1989s] +    self._post_launch()
[ 1989s] +  File "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/qtest.py", line 147, in _post_launch
[ 1989s] +    super()._post_launch()
[ 1989s] +  File "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py", line 340, in _post_launch
[ 1989s] +    self._qmp.accept(self._qmp_timer)
[ 1989s] +  File "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/aqmp/legacy.py", line 67, in accept
[ 1989s] +    self._sync(
[ 1989s] +  File "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/aqmp/legacy.py", line 41, in _sync
[ 1989s] +    return self._aloop.run_until_complete(
[ 1989s] +  File "/usr/lib64/python3.8/asyncio/base_events.py", line 616, in run_until_complete
[ 1989s] +    return future.result()
[ 1989s] +  File "/usr/lib64/python3.8/asyncio/tasks.py", line 501, in wait_for
[ 1989s] +    raise exceptions.TimeoutError()
[ 1989s] +asyncio.exceptions.TimeoutError









reply via email to

[Prev in Thread] Current Thread [Next in Thread]