qemu-block

From: Juan Quintela
Subject: Re: [PATCH 6/6] gitlab-ci.d/buildtest: Disintegrate the build-coroutine-sigaltstack job
Date: Mon, 06 Feb 2023 09:46:26 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/28.2 (gnu/linux)

Thomas Huth <thuth@redhat.com> wrote:
> On 03/02/2023 22.14, Juan Quintela wrote:
>> Peter Maydell <peter.maydell@linaro.org> wrote:
>>> On Fri, 3 Feb 2023 at 15:44, Thomas Huth <thuth@redhat.com> wrote:
>>>>
>>>> On 03/02/2023 13.08, Kevin Wolf wrote:
>>>>> Am 03.02.2023 um 12:23 hat Thomas Huth geschrieben:
>>>>>> On 30/01/2023 11.58, Daniel P. Berrangé wrote:
>>>>>>> On Mon, Jan 30, 2023 at 11:44:46AM +0100, Thomas Huth wrote:
>>>>>>>> We can get rid of the build-coroutine-sigaltstack job by moving
>>>>>>>> the configure flags that should be tested here to other jobs:
>>>>>>>> Move --with-coroutine=sigaltstack to the build-without-defaults job
>>>>>>>> and --enable-trace-backends=ftrace to the cross-s390x-kvm-only job.
>>>>>>>
>>>>>>> The biggest user of coroutines is the block layer. So we probably
>>>>>>> ought to have coroutines aligned with a job that triggers the
>>>>>>> 'make check-block' for iotests.  IIUC,  the without-defaults
>>>>>>> job won't do that. How about, arbitrarily, using either the
>>>>>>> 'check-system-debian' or 'check-system-ubuntu' job. Those distros
>>>>>>> are closely related, so getting sigaltstack vs ucontext coverage
>>>>>>> between them is a good win, and they both trigger the block jobs
>>>>>>> IIUC.
>>>>>>
>>>>>> I gave it a try with the ubuntu job, but this apparently trips up the 
>>>>>> iotests:
>>>>>>
>>>>>>    https://gitlab.com/thuth/qemu/-/jobs/3705965062#L212
>>>>>>
>>>>>> Does anybody have a clue what could be going wrong here?
>>>>>
>>>>> I'm not sure how changing the coroutine backend could cause it, but
>>>>> primarily this looks like an assertion failure in migration code.
>>>>>
>>>>> Dave, Juan, any ideas what this assertion checks and why it could be
>>>>> failing?
>>>>
>>>> Ah, I think it's the bug that will be fixed by:
>>>>
>>>>    https://lore.kernel.org/qemu-devel/20230202160640.2300-2-quintela@redhat.com/
>>>>
>>>> The fix hasn't hit the master branch yet (I think), and I had another patch
>>>> in my CI that disables the aarch64 binary in that runner, so the iotests
>>>> suddenly have been executed with the alpha binary there --> migration 
>>>> fails.
>>>>
>>>> So never mind, it will be fixed as soon as Juan's pull request gets 
>>>> included.
>>>
>>> The migration tests have been flaky for a while now,
>>> including setups where host and guest page sizes are the same.
>>> (For instance, my x86 macos box pretty reliably sees failures
>>> when the machine is under load.)
>> I *thought* that we had fixed all of those.
>> But it is difficult for me to know because:
>> - It only happens when one runs "make check"
>> - running ./migration-test has never failed for me
>> - When it fails (and it has been a while since it has failed for me)
>>    it is impossible for me to detect what is going on, and as said, I have
>>    never been able to reproduce it running only migration-test.
>> I will try to run several at the same time and see if it happens.
>> And as Thomas said, I *think* that the fix that Peter Xu posted should
>> fix this issue.  Famous last words.
>
> The patch from Peter should fix my problems that I triggered via the
> iotests - but the migration-qtest is still unstable independent from
> that issue, I think. See for example the latest staging pipeline:
>
>  https://gitlab.com/qemu-project/qemu/-/pipelines/767961842
>
> The migration qtest failed in both the x86-freebsd-build and the
> ubuntu-20.04-s390x-all pipeline.
>
>  Thomas

 31/659 qemu:qtest+qtest-aarch64 / qtest-aarch64/migration-test                 ERROR          48.23s   killed by signal 6 SIGABRT
>>> G_TEST_DBUS_DAEMON=/home/gitlab-runner/builds/-LCfcJ2T/0/qemu-project/qemu/tests/dbus-vmstate-daemon.sh QTEST_QEMU_IMG=./qemu-img QTEST_QEMU_BINARY=./qemu-system-aarch64 MALLOC_PERTURB_=124 QTEST_QEMU_STORAGE_DAEMON_BINARY=./storage-daemon/qemu-storage-daemon /home/gitlab-runner/builds/-LCfcJ2T/0/qemu-project/qemu/build/tests/qtest/migration-test --tap -k
――――――――――――――――――――――――――――――――――――― ✀  ―――――――――――――――――――――――――――――――――――――
stderr:
Broken pipe
../tests/qtest/libqtest.c:190: kill_qemu() detected QEMU death from signal 11 (Segmentation fault) (core dumped)
TAP parsing error: Too few tests run (expected 41, got 12)
(test program exited with status code -6)
――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――

I don't know what to do with this:
- this is aarch64 tcg
- this *works* on f37, or at least I can't reproduce any error with make
  check on my box, and I *think* my configuration is quite extensive (as
  far as I know everything that can be compiled in fedora with packages
  in the distro):

configure file: /mnt/code/qemu/full/configure
        --enable-trace-backends=log
        --prefix=/usr
        --sysconfdir=/etc/sysconfig/
        --audio-drv-list=pa,alsa
        --with-coroutine=ucontext
        --with-git-submodules=validate
        --enable-alsa
        --enable-attr
        --enable-auth-pam
        --enable-avx2
        --enable-avx512f
        --enable-bochs
        --enable-bpf
        --enable-brlapi
        --disable-bsd-user
        --enable-bzip2
        --enable-cap-ng
        --enable-capstone
        --disable-cfi
        --disable-cfi-debug
        --enable-cloop
        --disable-cocoa
        --enable-containers
        --disable-coreaudio
        --enable-coroutine-pool
        --enable-crypto-afalg
        --enable-curl
        --enable-curses
        --enable-dbus-display
        --enable-debug-info
        --disable-debug-mutex
        --disable-debug-stack-usage
        --disable-debug-tcg
        --enable-dmg
        --enable-docs
        --disable-dsound
        --enable-fdt
        --enable-fuse
        --enable-fuse-lseek
        --disable-fuzzing
        --disable-gcov
        --disable-gcrypt
        --enable-gettext
        --enable-gio
        --enable-glusterfs
        --enable-gnutls
        --disable-gprof
        --enable-gtk
        --enable-guest-agent
        --disable-guest-agent-msi
        --disable-hax
        --disable-hvf
        --enable-iconv
        --enable-install-blobs
        --enable-jack
        --enable-keyring
        --enable-kvm
        --enable-l2tpv3
        --enable-libdaxctl
        --enable-libiscsi
        --enable-libnfs
        --enable-libpmem
        --enable-libssh
        --enable-libudev
        --enable-libusb
        --enable-linux-aio
        --enable-linux-io-uring
        --enable-linux-user
        --enable-live-block-migration
        --disable-lto
        --disable-lzfse
        --enable-lzo
        --disable-malloc-trim
        --enable-membarrier
        --enable-module-upgrades
        --enable-modules
        --enable-mpath
        --enable-multiprocess
        --disable-netmap
        --enable-nettle
        --enable-numa
        --disable-nvmm
        --enable-opengl
        --enable-oss
        --enable-pa
        --enable-parallels
        --enable-pie
        --enable-plugins
        --enable-png
        --disable-profiler
        --enable-pvrdma
        --enable-qcow1
        --enable-qed
        --disable-qom-cast-debug
        --enable-rbd
        --enable-rdma
        --enable-replication
        --enable-rng-none
        --disable-safe-stack
        --disable-sanitizers
        --enable-stack-protector
        --enable-sdl
        --enable-sdl-image
        --enable-seccomp
        --enable-selinux
        --enable-slirp
        --enable-slirp-smbd
        --enable-smartcard
        --enable-snappy
        --enable-sparse
        --enable-spice
        --enable-spice-protocol
        --enable-system
        --enable-tcg
        --disable-tcg-interpreter
        --enable-tools
        --enable-tpm
        --disable-tsan
        --disable-u2f
        --enable-usb-redir
        --enable-user
        --disable-vde
        --enable-vdi
        --enable-vhost-crypto
        --enable-vhost-kernel
        --enable-vhost-net
        --enable-vhost-user
        --enable-vhost-user-blk-server
        --enable-vhost-vdpa
        --enable-virglrenderer
        --enable-virtfs
        --enable-virtiofsd
        --enable-vnc
        --enable-vnc-jpeg
        --enable-vnc-sasl
        --enable-vte
        --enable-vvfat
        --enable-werror
        --disable-whpx
        --enable-xen
        --enable-xen-pci-passthrough
        --enable-xkbcommon
        --enable-zstd

- It gives a segmentation fault.  Nothing else.
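
Since the failures seem to show up only when the machine is under load, one way to try to reproduce them is to run several copies of the test at the same time. What follows is only a rough sketch of that idea, not something that exists in the QEMU tree: the helper names (describe, run_once, stress) are made up for illustration, and the number of runs and parallel jobs are arbitrary.

```python
import os
import signal
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def describe(returncode):
    """Turn a subprocess return code into a short human-readable status."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # On POSIX a negative return code means "killed by signal -returncode".
        try:
            return "killed by " + signal.Signals(-returncode).name
        except ValueError:
            return "killed by signal %d" % -returncode
    return "exit status %d" % returncode


def run_once(cmd, extra_env):
    """Run cmd once with extra_env added to the environment; return its exit code."""
    env = dict(os.environ, **extra_env)
    proc = subprocess.run(cmd, env=env,
                          stdout=subprocess.DEVNULL,
                          stderr=subprocess.DEVNULL)
    return proc.returncode


def stress(cmd, runs=8, jobs=4, extra_env=None):
    """Run cmd `runs` times, at most `jobs` at a time; return one status per run."""
    extra_env = extra_env or {}
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        codes = list(pool.map(lambda _: run_once(cmd, extra_env), range(runs)))
    return [describe(code) for code in codes]
```

From a build directory this could be called as stress(["./tests/qtest/migration-test", "--tap", "-k"], extra_env={"QTEST_QEMU_BINARY": "./qemu-system-aarch64", "QTEST_QEMU_IMG": "./qemu-img", "MALLOC_PERTURB_": "124"}), with the variables taken from the CI log above and the paths adjusted to the local tree. Any entry other than "ok" in the result is a run worth looking at.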

Can we get at least a backtrace to work from there?
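
The log says "core dumped", so if the core file can be kept on the runner, gdb in batch mode should be enough to print the backtraces. A minimal sketch, assuming gdb is installed and that the core file is reachable under some known path (how and where the CI runner stores cores is an assumption here, and the helper names are hypothetical):

```python
import shutil
import subprocess


def gdb_backtrace_cmd(binary, core):
    """Build a non-interactive gdb command line that prints all thread backtraces."""
    return ["gdb", "-batch", "-ex", "thread apply all bt", binary, core]


def backtrace(binary, core):
    """Return gdb's output for the core file, or None if gdb is not installed."""
    if shutil.which("gdb") is None:
        return None
    proc = subprocess.run(gdb_backtrace_cmd(binary, core),
                          capture_output=True, text=True)
    # gdb prints warnings on stderr; keep both streams for the report.
    return proc.stdout + proc.stderr
```

For the failure above the binary would be ./qemu-system-aarch64, since it is QEMU itself that died from signal 11, not the migration-test program.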

Thanks, Juan.




