deadlock in NPTL, FUTEX
From: Bernhard Ibertsberger
Subject: deadlock in NPTL, FUTEX
Date: Fri, 19 Sep 2008 15:07:19 +0200
User-agent: Mozilla-Thunderbird 2.0.0.9 (X11/20080110)
Hi,
Since kernel 2.5, signals seem to come with some trickier pitfalls
(the "Native POSIX Threads Library", NPTL, vs. linuxthreads-0.10);
see:
people.redhat.com/drepper/futex.pdf
OK, calling a non-reentrant function from a signal handler is a bad idea
(that was the actual problem in a customer's code). But now I'd
like to understand the problem more thoroughly.
While investigating the issue I found:
http://lwn.net/Articles/124747/
and
http://ubuntuforums.org/showthread.php?t=675821
This demo [1] deadlocks immediately and (sometimes) gives stack traces like:
Program received signal SIGINT, Interrupt.
0xffffe410 in __kernel_vsyscall ()
(gdb) bt
#0 0xffffe410 in __kernel_vsyscall ()
#1 0x430893ce in __lll_mutex_lock_wait () from /lib/tls/libc.so.6
#2 0x430319c9 in _L_mutex_lock_1945 () from /lib/tls/libc.so.6
#3 0x4300a14b in vsprintf () from /lib/tls/libc.so.6
#4 0x4302fa20 in localtime () from /lib/tls/libc.so.6
#5 0x4302f8e1 in ctime () from /lib/tls/libc.so.6
#6 0x08048491 in handler ()
#7 <signal handler called>
#8 0x42fdebfd in getenv () from /lib/tls/libc.so.6
#9 0x43030aa9 in tzset_internal () from /lib/tls/libc.so.6
#10 0x430317ca in __tz_convert () from /lib/tls/libc.so.6
#11 0x4302fa20 in localtime () from /lib/tls/libc.so.6
#12 0x4302f8e1 in ctime () from /lib/tls/libc.so.6
#13 0x08048550 in main ()
(gdb)
The deadlock also occurs if I omit the assignment to the global char *r.
I'd love to know:
* What exactly is the anatomy of the deadlock inside libc and the
kernel? Does it necessarily need a page fault, or can the deadlock occur
under other circumstances as well?
* What is the connection between localtime() and vsprintf()? In the source
of localtime() I can find only a call to __tz_convert().
* Where can I find further information about this problem?
If I run demo [2] as
$ ./printf-hang | grep t
it deadlocks too, although there are no calls to ctime(). This suggests that
concurrent printf() calls on the (non-reentrant) stdout, from the main loop
and from the handler, are enough to deadlock.
Thanks in advance,
Bernhard
[1]
___________ ctime-hang.c ___________
#include <sys/time.h>
#include <time.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>

volatile char *r;

/* Deliberately unsafe: ctime() is not async-signal-safe. */
void handler(int sig)
{
    time_t t;

    time(&t);
    r = ctime(&t);
}

int main(void)
{
    struct itimerval it;
    struct sigaction sa;
    time_t t;
    int counter = 0;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = handler;
    sigaction(SIGALRM, &sa, NULL);

    /* Fire SIGALRM every 1 ms. */
    it.it_value.tv_sec = 0;
    it.it_value.tv_usec = 1000;
    it.it_interval.tv_sec = 0;
    it.it_interval.tv_usec = 1000;
    setitimer(ITIMER_REAL, &it, NULL);

    /* The same ctime() call as in the handler, interrupted by it. */
    while (1) {
        counter++;
        time(&t);
        r = ctime(&t);
        printf("Loop %d\n", counter);
    }
    return 0;
}
---ctime-hang.c end---
[2]
_______printf-hang.c_________
#include <sys/time.h>
#include <time.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <pthread.h>

volatile int action_counter;
volatile char *r;
pthread_mutex_t cs_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Deliberately unsafe: printf() and fflush() are not async-signal-safe. */
void handler(int sig)
{
    action_counter++;
    printf("-t");
    fflush(stdout);
    printf("-t");
    fflush(stdout);
    printf("\n");
}

int main(void)
{
    struct itimerval it;
    struct sigaction sa;
    time_t t;
    int counter = 0;

    action_counter = 0;
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = handler;
    sigaction(SIGALRM, &sa, NULL);

    /* Fire SIGALRM every 10 ms. */
    it.it_value.tv_sec = 0;
    it.it_value.tv_usec = 10000;
    it.it_interval.tv_sec = 0;
    it.it_interval.tv_usec = 10000;
    setitimer(ITIMER_REAL, &it, NULL);

    /* Same stdio calls as in the handler, interrupted by it. */
    while (1) {
        counter++;
        printf("l");
        fflush(stdout);
        printf("l");
        fflush(stdout);
        printf("\n");
    }
    return 0;
}
----printf-hang.c end ------------