Re: test-pthread-rwlock failure on Pop!_OS 22.04 LTS
From: Bruno Haible
Subject: Re: test-pthread-rwlock failure on Pop!_OS 22.04 LTS
Date: Wed, 14 Aug 2024 16:53:42 +0200
Pádraig Brady wrote:
> It failed again the same way with the latest gnulib.
> Note the test is run with `make -j24 check` from crontab.
Yeah, this test is known to fail under load. There is no way to make it
100% reliable under high load. Even if I were to choose a STEP_INTERVAL of
1 second, it would probably still fail under a load of 500 or so.
> I did however notice different output depending on
> whether the test was run in the foreground or background
This might hint at effects of how the scheduler works.
> If I run the test in the foreground with
> ./gnulib-tests/test-pthread-rwlock-waitqueue
> I don't see any of the "... => ..." lines output.
These "... => ..." lines are debugging aids, printed in case some result
is unexpected. That happens only when the load is high. For example:
WRR => W1 R2 R3
WRRR => W1 R2 R3 R4
WRRRR => R2 W1 R4 R5 R3
The last line means that after one thread enqueues a request for locking as
a writer (while no reader is present) and, one STEP_INTERVAL later, another
thread enqueues a request for locking as a reader, the second thread gets
the lock first. This is only possible if the kernel has not reacted to the
first request within STEP_INTERVAL.
I'm committing this workaround, which has the effect of excluding the
test from coreutils (unless you are using --with-longrunning-tests).
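
For a sense of scale: the durations listed in the patch comment below grow linearly with STEP_INTERVAL, which is consistent with the test spending roughly 2880 steps waiting. That step count is an inference from the table, not a figure taken from the test source, and `estimated_duration_seconds` is a hypothetical helper. A minimal sketch of the estimate in C:

```c
/* Estimated wall-clock duration, in seconds, if the test performs STEPS
   sequential waits of STEP_INTERVAL_NS nanoseconds each.
   Assumption: all other work in the test is negligible by comparison.  */
double estimated_duration_seconds (long long step_interval_ns, long long steps)
{
  return (double) steps * (double) step_interval_ns / 1e9;
}
```

With 100 ms and 2880 steps this gives 288 seconds, matching the 4.8 min entry. With a STEP_INTERVAL of 1 second, the same estimate comes to about 48 minutes, which illustrates why a much longer interval is not a practical fix.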
2024-08-14 Bruno Haible <bruno@clisp.org>
pthread-rwlock-extra-tests: Exclude this test from packages by default.
* tests/test-pthread-rwlock-waitqueue.c (STEP_INTERVAL): Add comment.
* modules/pthread-rwlock-extra-tests (Status): Mark as longrunning-test.
diff --git a/modules/pthread-rwlock-extra-tests b/modules/pthread-rwlock-extra-tests
index c14e3ed8ac..2d93488dfd 100644
--- a/modules/pthread-rwlock-extra-tests
+++ b/modules/pthread-rwlock-extra-tests
@@ -1,3 +1,6 @@
+Status:
+longrunning-test
+
Files:
tests/test-pthread-rwlock-waitqueue.c
tests/macros.h
diff --git a/tests/test-pthread-rwlock-waitqueue.c b/tests/test-pthread-rwlock-waitqueue.c
index ad190b5491..6b800ea5cc 100644
--- a/tests/test-pthread-rwlock-waitqueue.c
+++ b/tests/test-pthread-rwlock-waitqueue.c
@@ -48,7 +48,23 @@
flavours of read-write locks. */
/* Some platforms need a longer STEP_INTERVAL, otherwise some of the assertions
- RRR, RRRR, RRRRR fail. */
+ RRR, RRRR, RRRRR fail.
+ Note: The probability of failing these assertions is higher when the machine
+ is under high load. It can be worked around by increasing the STEP_INTERVAL.
+ However, increasing the STEP_INTERVAL means to increase the total duration
+ of this test:
+ STEP_INTERVAL Duration (on glibc/Linux)
+ 10 ms 29 sec
+ 20 ms 57 sec
+ 50 ms 2.4 min
+ 100 ms 4.8 min
+ 200 ms 9.6 min
+ There is no way to have this test be reasonably fast and 100% reliable at the
+ same time. Therefore the compromise we have chosen is
+ - to pick STEP_INTERVAL so that the test succeeds on developer machines
+ with little load and on continuous integration machines,
+ - to exclude the test from packaging, unless the gnulib-tool option
+ '--with-longrunning-tests' is specified. */
#if (defined __APPLE__ && defined __MACH__)
/* macOS */
# define STEP_INTERVAL 200000000 /* nanoseconds */