From: Michael Banck
Subject: Hangs in Postgres test suite (was: Re: [PATCH mach v4 1/1] Integrate HPET so the functions used for getting time can have a higher accuracy)
Date: Sat, 14 Jun 2025 12:36:40 +0200
User-agent: Mutt/1.10.1 (2018-07-13)
Hi,
changing the subject, because this is not related to HPET anymore.
On Sat, Jun 14, 2025 at 01:34:28AM +0000, Damien Zammit wrote:
> How does it crash? Do you get a backtrace?
No, it just hangs without any output.
> What does it actually do during a crash?
It runs the Postgres regression test suite, which runs quite a few
processes in parallel and creates/tears down data directories, so I
think it is pretty much a torture test.
However, I run it via the build-farm client which does CI
(https://wiki.postgresql.org/wiki/PostgreSQL_Buildfarm_Howto), which
sets debug_parallel_query=on by default. If I run "make NO_LOCALE=1
check" manually or override the config to remove that setting, it does
not usually crash the system. I also noticed that the individual test
runtimes are much larger[1] when running it via run_build.pl with
debug_parallel_query=on than when running manually (or after removing
that parameter).
Load goes up to 6-7 with peaks over 10, memory is being used, but
running 'free' in a loop does not show anything alarming.
This was the output of uptime/free/df when it hung:
| 11:13:41 up 12 min, 9 users, load average: 3.87, 5.62, 4.74
|               total        used        free      shared  buff/cache   available
|Mem:         2041796      860072     1181724           0           0     1181724
|Swap:         975868           0      975868
|Filesystem     1K-blocks    Used Available Use% Mounted on
|/dev/hd0s2       4143104 3676732    259220  94% /
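Running 'free' in a loop only helps if the last sample before the hang is still readable afterwards. A minimal sketch of a logger that appends timestamped load averages to a file and syncs every sample to disk could look like the following. The function names, the log format and the log path are hypothetical, and it assumes os.getloadavg() works on the system under test (it falls back to NaN if it does not).

```python
import os
import time


def format_sample(timestamp, load):
    """Render one log line: wall-clock time plus the 1/5/15 minute load averages."""
    stamp = time.strftime("%H:%M:%S", time.localtime(timestamp))
    return "%s load %.2f %.2f %.2f" % ((stamp,) + tuple(load))


def monitor(path, interval=1.0, count=None):
    """Append load samples to `path`, syncing each one to disk.

    `count=None` loops until interrupted; a number stops after that many samples.
    Returns the number of samples written.
    """
    taken = 0
    with open(path, "a") as log:
        while count is None or taken < count:
            try:
                load = os.getloadavg()
            except OSError:
                load = (float("nan"),) * 3  # load average unavailable here
            log.write(format_sample(time.time(), load) + "\n")
            log.flush()
            os.fsync(log.fileno())  # push the sample out before a possible hang
            taken += 1
            if count is None or taken < count:
                time.sleep(interval)
    return taken
```

Calling monitor("load.log", interval=1.0) in a separate terminal before starting run_build.pl would leave a record of how the load developed up to the hang.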
> Can you try with a kdb enabled gnumach?
I tried that, but I have to admit I don't have a lot of experience with
it. The debugger is responsive, and if I step/continue it mostly runs
through the idle thread. There were around 80 tasks, ext2fs has around
45 threads and pflocal sometimes has up to 50-70 threads. Except for a
few in thread_bootstrap_return from postgres tasks, most of the
postgres-related tasks are in either mach_msg_continue or
mach_msg_receive_continue.
Let me know what I should be looking out for. But I guess
Michael
[1]
manually:
|echo "# +++ regress check in src/test/regress +++" &&
| PATH="/home/demo/postgres/tmp_install/home/demo/build-farm-19.1/buildroot/HEAD/inst/bin:/home/demo/postgres/src/test/regress:$PATH"
| LD_LIBRARY_PATH="/home/demo/postgres/tmp_install/home/demo/build-farm-19.1/buildroot/HEAD/inst/lib"
| INITDB_TEMPLATE='/home/demo/postgres'/tmp_install/initdb-template
| ../../../src/test/regress/pg_regress --temp-instance=./tmp_check --inputdir=. --bindir= --no-locale --dlpath=. --max-concurrent-tests=20 --schedule=./parallel_schedule
|# +++ regress check in src/test/regress +++
|# initializing database system by running initdb
|# using temp instance on port 65312 with PID 3380
|ok 1 - test_setup 400 ms
|# parallel group (20 tests): int2 pg_lsn boolean money oid txid int8 regproc float4 float8 char text name varchar enum int4 bit uuid numeric rangetypes
|ok 2 + boolean 380 ms
|ok 3 + char 610 ms
|ok 4 + name 630 ms
|ok 5 + varchar 640 ms
|ok 6 + text 620 ms
|ok 7 + int2 300 ms
|ok 8 + int4 730 ms
|ok 9 + int8 480 ms
|ok 10 + oid 420 ms
|ok 11 + float4 530 ms
|ok 12 + float8 590 ms
|ok 13 + bit 840 ms
|ok 14 + numeric 1450 ms
|ok 15 + txid 450 ms
|ok 16 + uuid 960 ms
|ok 17 + enum 710 ms
|ok 18 + money 400 ms
|ok 19 + rangetypes 1580 ms
|ok 20 + pg_lsn 300 ms
|ok 21 + regproc 480 ms
run_build.pl:
|echo "# +++ regress check in src/test/regress +++" &&
| PATH="/home/demo/postgres/tmp_install/home/demo/build-farm-19.1/buildroot/HEAD/inst/bin:/home/demo/postgres/src/test/regress:$PATH"
| LD_LIBRARY_PATH="/home/demo/postgres/tmp_install/home/demo/build-farm-19.1/buildroot/HEAD/inst/lib"
| INITDB_TEMPLATE='/home/demo/postgres'/tmp_install/initdb-template
| ../../../src/test/regress/pg_regress --temp-instance=./tmp_check --inputdir=. --bindir= --temp-config=/home/demo/build-farm-19.1/buildroot/tmp/buildfarm-heJzFo/bfextra.conf --no-locale --port=5678 --dlpath=. --max-concurrent-tests=20 --port=5678 --schedule=./parallel_schedule
|# +++ regress check in src/test/regress +++
|# initializing database system by running initdb
|# using temp instance on port 5678 with PID 10923
|ok 1 - test_setup 440 ms
|# parallel group (20 tests): txid pg_lsn char int4 varchar oid name regproc boolean enum uuid float4 float8 money int2 bit int8 text rangetypes numeric
|ok 2 + boolean 9090 ms
|ok 3 + char 1380 ms
|ok 4 + name 2530 ms
|ok 5 + varchar 1500 ms
|ok 6 + text 14600 ms
|ok 7 + int2 13850 ms
|ok 8 + int4 1460 ms
|ok 9 + int8 14160 ms
|ok 10 + oid 2460 ms
|ok 11 + float4 12490 ms
|ok 12 + float8 12690 ms
|ok 13 + bit 13950 ms
|ok 14 + numeric 18530 ms
|ok 15 + txid 1030 ms
|ok 16 + uuid 10360 ms
|ok 17 + enum 9170 ms
|ok 18 + money 13500 ms
|ok 19 + rangetypes 16410 ms
|ok 20 + pg_lsn 1320 ms
|ok 21 + regproc 9080 ms
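
To put a number on the difference between the two runs, the "ok" lines can be parsed and compared per test. This is only a rough sketch: the regular expression assumes the line shape shown in the two listings above (optional leading "|", test number, "+" or "-", test name, runtime in ms), and the helper names are made up for illustration.

```python
import re

# Matches pg_regress result lines such as "|ok 2 + boolean 380 ms".
RESULT_LINE = re.compile(r"^\|?\s*ok\s+\d+\s+[+-]\s+(\S+)\s+(\d+)\s+ms\s*$")


def parse_timings(text):
    """Return {test_name: runtime_ms} for every 'ok' line in pg_regress output."""
    timings = {}
    for line in text.splitlines():
        match = RESULT_LINE.match(line)
        if match:
            timings[match.group(1)] = int(match.group(2))
    return timings


def slowdown(baseline, other):
    """Return (test, baseline_ms, other_ms, ratio) rows, largest ratio first."""
    rows = []
    for name, base_ms in baseline.items():
        if name in other and base_ms > 0:
            rows.append((name, base_ms, other[name], other[name] / base_ms))
    return sorted(rows, key=lambda row: row[3], reverse=True)


if __name__ == "__main__":
    # Two lines from each listing above, as sample input.
    manual = parse_timings("|ok 7 + int2 300 ms\n|ok 15 + txid 450 ms\n")
    buildfarm = parse_timings("|ok 7 + int2 13850 ms\n|ok 15 + txid 1030 ms\n")
    for name, base_ms, other_ms, ratio in slowdown(manual, buildfarm):
        print("%-12s %6d ms -> %6d ms  (%.1fx)" % (name, base_ms, other_ms, ratio))
```

With the numbers above, int2 goes from 300 ms to 13850 ms (roughly 46 times slower), while txid only goes from 450 ms to 1030 ms, so the slowdown is far from uniform across the tests in the group.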