Re: runaway avocado
Thu, 11 Feb 2021 12:25:41 -0500
On Fri, Feb 05, 2021 at 07:23:22PM +0000, Peter Maydell wrote:
> On Mon, 26 Oct 2020 at 22:35, Peter Maydell <email@example.com> wrote:
> > So, I somehow ended up with this process still running on my
> > local machine after a (probably failed) 'make check-acceptance':
> > petmay01 13710 99.7 3.7 2313448 1235780 pts/16 Sl 16:10 378:00
> > ./qemu-system-aarch64 -display none -vga none -chardev
> > socket,id=mon,path=/var/tmp/tmp5szft2yi/qemu-13290-monitor.sock -mon
> > chardev=mon,mode=control -machine virt -chardev
> > socket,id=console,path=/var/tmp/tmp5szft2yi/qemu-13290-console.sock,server,nowait
> > -serial chardev:console -icount
> > shift=7,rr=record,rrfile=/var/tmp/avocado_iv8dehpo/avocado_job_w9efukj5/32-tests_acceptance_reverse_debugging.py_ReverseDebugging_AArch64.test_aarch64_virt/replay.bin,rrsnapshot=init
> > -net none -drive
> > file=/var/tmp/avocado_iv8dehpo/avocado_job_w9efukj5/32-tests_acceptance_reverse_debugging.py_ReverseDebugging_AArch64.test_aarch64_virt/disk.qcow2,if=none
> > -kernel
> > /home/petmay01/avocado/data/cache/by_location/a00ac4ae676ef0322126abd2f7d38f50cc9cbc95/vmlinuz
> > -cpu cortex-a53
> > and it was continuing to log to a deleted file
> > /var/tmp/avocado_iv8dehpo/avocado_job_w9efukj5/32-tests_acceptance_reverse_debugging.py_ReverseDebugging_AArch64.test_aarch64_virt/replay.bin
> > which was steadily eating my disk space and got up to nearly 100GB
> > in used disk (invisible to du, of course, since it was an unlinked
> > file) before I finally figured out what was going on and killed it
> > about six hours later...
> Just got hit by this test framework bug again :-( Same thing,
> runaway avocado record-and-replay test ate all my disk space.
> -- PMM
I'm sorry this caused you trouble again.
IIUC, this specific issue was caused by a runaway QEMU. Granted, it was
started by an Avocado test. I've opened a bug report to look into
ways to mitigate this or prevent it from happening again:
The bug report contains a bit more context on why Avocado does not
try to kill all processes started by a test by default.
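For illustration only, here is a minimal Python sketch of the kind of
cleanup under discussion: the child is started in its own session, so
the whole process group can be signalled when the test ends. This is
not Avocado's actual code. The helper names run_in_own_group and
kill_group and the grace period are hypothetical, and a POSIX system is
assumed.

```python
import os
import signal
import subprocess


def run_in_own_group(argv):
    """Start argv as the leader of a new session and process group.

    Hypothetical helper for illustration; not part of Avocado.
    """
    # start_new_session=True makes the child call setsid() before exec,
    # so its process group ID equals its PID.
    return subprocess.Popen(argv, start_new_session=True)


def kill_group(proc, grace=2.0):
    """Send SIGTERM to the child's process group, then SIGKILL if needed.

    Returns the exit status of the group leader.
    """
    for sig in (signal.SIGTERM, signal.SIGKILL):
        try:
            # The child is the group leader, so proc.pid is also the PGID.
            os.killpg(proc.pid, sig)
        except ProcessLookupError:
            break  # no process is left in the group
        try:
            return proc.wait(timeout=grace)
        except subprocess.TimeoutExpired:
            continue  # still alive after the grace period, escalate
    return proc.wait()
```

Note that a process which calls setsid() or changes its process group
on its own escapes this kind of cleanup, which is one reason why
killing everything a test started is not trivial.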
BTW, we've been working with Pavel on identifying issues with
replay/reverse features that are causing test failures. So far,
I've seen a couple of issues that may be related to this runaway
QEMU writing to the replay.bin file.