[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-arm] [BUG] qemu-system-arm/aarch64 QMP hangs after kernel boot
From: |
Cleber Rosa |
Subject: |
Re: [Qemu-arm] [BUG] qemu-system-arm/aarch64 QMP hangs after kernel boots |
Date: |
Tue, 21 May 2019 16:13:18 -0400 (EDT) |
----- Original Message -----
> From: "Auger Eric" <address@hidden>
> To: "Cleber Rosa" <address@hidden>
> Cc: address@hidden, "Peter Maydell" <address@hidden>
> Sent: Monday, May 20, 2019 3:21:18 PM
> Subject: Re: [BUG] qemu-system-arm/aarch64 QMP hangs after kernel boots
>
> Hi Cleber,
>
> On 5/20/19 7:47 PM, Cleber Rosa wrote:
> > Hi,
> >
> > This is just a FIY about the following bug report:
> >
> > - https://bugs.launchpad.net/qemu/+bug/1829779
> >
> > And add some extra clarifications that I've tested extensively the
> > behavior of other targets (all that have a similar test under
> > tests/acceptance/boot_linux_console.py) and none exhibited the same
> > behavior.
> >
> > Also, because qemu-system-arm and aarch64 sometimes fails on the very
> > first QMP command, the regular boot test sometimes will hang on a
> > "quit" command that happens when the test run its "tear down" and
> > attempts to cleanly shut down the "vm".
> >
> > Let me know if I can be of any further help.
>
> Thanks for reporting the problem. I hit the problem recently with
> device_del commands which did not seem to get any response on guest
> side. On my side, this happened erratically.
>
> I will try to reproduce it again.
>
> Thanks
>
> Eric
>
I have an update on this. Eric and myself attempted to zero in the
exact cause. A few things we discovered:
1) It has nothing to do with having a kernel running
2) It has to do with having a chardev that is a server socket. This
test produces command line arguments such as:
-chardev socket,id=console,path=<path>.sock,server,nowait \
-serial chardev:console
3) It doesn't seem to have a connection to the test infrastructure code
(python/qemu/qmp/*), as a I made a number of experiments which
yielded no differences in behavior.
So, the reproducer given at:
https://github.com/clebergnu/qemu/commit/c778e28c24030c4a36548b714293b319f4bf18df
Continues to be be valid (and continues to be limited to arm and aarch64).
Now, after a number of experiments, the following was found to be a 100%
reproducible *workaround*:
https://github.com/clebergnu/qemu/commit/e1713f3b91972ad57c089f276c54db3f3fa63423
That basically shutdowns the *console* socket before proceeding with further QMP
interaction. The effectiveness of the workaround can be seen here:
aarch64 command line:
- https://travis-ci.org/clebergnu/qemu/jobs/535459499#L3633
aarch64 QMP interaction:
- https://travis-ci.org/clebergnu/qemu/jobs/535459499#L3663
arm command line:
- https://travis-ci.org/clebergnu/qemu/jobs/535459499#L3747
arm QMP interaction:
- https://travis-ci.org/clebergnu/qemu/jobs/535459499#L3767
I hope this provides a few more hints into the real issue.
Regards,
- Cleber.