Re: Test-lock hang (not 100% reproducible) on GNU/Linux
From: Pavel Raiskup
Subject: Re: Test-lock hang (not 100% reproducible) on GNU/Linux
Date: Wed, 04 Jan 2017 15:48:53 +0100
User-agent: KMail/5.3.3 (Linux/4.8.15-300.fc25.x86_64; KDE/5.27.0; x86_64; ; )
On Wednesday, January 4, 2017 3:17:01 PM CET Bruno Haible wrote:
> Pádraig Brady:
> > Now that test-lock.c is relatively fast on numa/multicore systems,
> > it seems like it would be useful to first alarm(30) or something
> > to protect against infinite hangs?
>
> If we could not pinpoint the origin of the problem, I agree, an alarm(30)
> would be the right thing to prevent an infinite hang.
>
> But by now, we know
>
> 1) It's a glibc bug: The test [6] fails even after it has set the
> policies that POSIX expects for the "writers get the rwlock in preference
> to readers guarantee".
>
> 2) Without this guarantee, a reader function that repeatedly spends
> I milliseconds in a section protected by the rwlock,
> O milliseconds without the rwlock being held,
> in a system with N reader threads in parallel
> will lead to
> - a successful termination if N * I / (I + O) < 1.0
> - an infinite hang if N * I / (I + O) > 1.0
> (There is actually no discontinuity at 1.0; need to use probability
> calculus for a more detailed analysis.)
> So, in order to make test_rwlock hang-free, I would need to introduce
> a sleep() without the rwlock being held, and the duration of this sleep
> would be at least (N - 1) * I.
>
> Now, asking an application writer to add sleep()s in his code, with
> a duration that depends both on the number of threads and on the time
> spent in specific portions of the code, is outrageous.
>
> So, as it stands, POSIX rwlock without a "writers get preference" guarantee
> is unusable.
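To make the quoted condition concrete, here is a minimal sketch in C of the arithmetic only. The function names reader_load() and min_out_ms() are hypothetical and are not part of test-lock.c or gnulib; the model is the rough estimate N * I / (I + O) from the quoted text, not a probabilistic analysis.

```c
/* Hypothetical helpers illustrating the quoted estimate; not gnulib code.  */

/* Estimated reader load on the rwlock: N readers, each spending IN_MS
   milliseconds inside the lock and OUT_MS milliseconds outside it per
   iteration.  Per the quoted estimate, a value above 1.0 suggests that
   some reader always holds the lock, so a writer without preference
   may never get in.  */
static double
reader_load (int n_readers, double in_ms, double out_ms)
{
  return n_readers * in_ms / (in_ms + out_ms);
}

/* Smallest OUT_MS for which the load drops to 1.0, obtained by solving
   N * I / (I + O) = 1 for O, which gives O = (N - 1) * I.  */
static double
min_out_ms (int n_readers, double in_ms)
{
  return (n_readers - 1) * in_ms;
}
```

For example, with 10 reader threads each holding the lock for 1 ms, min_out_ms (10, 1.0) gives 9 ms of sleep per iteration, which is the "(N - 1) * I" figure quoted above.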
Unless we play with the probabilities a bit longer, I'm still afraid this
just moves the problem somewhere else: if writers had preference and were
able to hold the rwlock all the time, readers would starve.
I agree that gl_pthread_rwlock* should match the specification, but at
least in the actual algorithm in test_rwlock() we should make sure that
some readers actually get to do something _during_ the writers' typhoon
(this is no longer about test_rwlock() hanging, but we certainly want the
test to exercise something).
Pavel
> I propose to do what we usually do in gnulib, to work around bugs and unusable
> APIs:
> - Write a configure test for the guarantee, based on [6].
> - Modify the 'lock' module to use its own implementation of rwlock.
> - Add a unit test to verify the guarantee (so that we can also detect
> if the same problem occurs in pth or Solaris), again based on [6].
>
> Patch in preparation...
>
> Bruno
>
> [6]
> https://github.com/linux-test-project/ltp/blob/master/testcases/open_posix_testsuite/conformance/interfaces/pthread_rwlock_rdlock/2-2.c