Re: [monit] Can't restart monit

From: Martin Pala
Subject: Re: [monit] Can't restart monit
Date: Wed, 21 Jan 2009 20:52:33 +0100

On Jan 21, 2009, at 1:27 PM, Nicola Tiling wrote:

On 20.01.2009 at 22:27, Martin Pala wrote:

The root cause is most probably that monit masks signals (including SIGTERM) during critical sections. FreeBSD's /etc/rc.subr routines, which monit's rc script uses, perform a restart by stopping and then starting the process. The stop is performed by sending SIGTERM to the process. When monit is in a critical section, it can thus ignore the SIGTERM signal, whereas the rc script assumes that it managed to signal the process to stop and waits for the pid to exit.
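To illustrate the masking behaviour described above, here is a minimal sketch in Python. It is not monit's actual C code; the names run_demo and on_term are made up for this example, and raise_signal stands in for the kill sent by the rc script. It assumes a POSIX system, Python 3.8 or later, and that it runs in the main thread.

```python
import signal

received = []  # signal numbers seen by the handler


def on_term(signum, frame):
    # Hypothetical handler: just record that SIGTERM arrived.
    received.append(signum)


def run_demo():
    """Block SIGTERM around a pretend critical section and report what happens."""
    received.clear()
    previous = signal.signal(signal.SIGTERM, on_term)

    # Enter the "critical section": SIGTERM is blocked, not discarded.
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGTERM})

    # Stand-in for the rc script's stop request (sent to the calling thread).
    signal.raise_signal(signal.SIGTERM)

    pending_during = signal.SIGTERM in signal.sigpending()
    handled_during = bool(received)

    # Leave the critical section: restoring the mask delivers the pending signal.
    signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
    handled_after = bool(received)

    # Put back whatever handler was installed before the demo.
    if previous is not None:
        signal.signal(signal.SIGTERM, previous)
    return pending_during, handled_during, handled_after


if __name__ == "__main__":
    print(run_demo())
```

In this sketch the signal stays pending while the mask is in place and the handler runs only once the mask is restored. A sender that does not wait long enough, or a process that never leaves its critical section, would see the behaviour described above: the stop request appears to have been sent, but the process does not exit.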

I think monit freezes. Today the last entry in /var/log/monit is from [ Jan 21 03:26:04] and mmonit says "No report from monit. Last report was Wed, 21 Jan 2009 03:24:03". The web console of monit itself is reachable, but none of the watched processes are clickable.

If it happens again (with the fixed version), could you please get thread stack traces using:

  gdb monit <monit's pid>

Example (monit running with pid 941):

# gdb /usr/local/bin/monit 941

(gdb) info threads
  2 process 941 thread 0x903  0x96fcb6f2 in select$DARWIN_EXTSN ()
* 1 process 941 thread 0x203  0x96f833ae in __semwait_signal ()

(gdb) thr 1
[Switching to thread 1 (process 941 thread 0x203)]
0x96f833ae in __semwait_signal ()

(gdb) bt
#0  0x96f833ae in __semwait_signal ()
#1  0x96f8322f in nanosleep$UNIX2003 ()
#2  0x96fd8e71 in sleep$UNIX2003 ()
#3  0x00009ade in main (argc=3, argv=0xbffffd18) at monitor.c:503

(gdb) thr 2
[Switching to thread 2 (process 941 thread 0x903)]
0x96fcb6f2 in select$DARWIN_EXTSN ()

(gdb) bt
#0  0x96fcb6f2 in select$DARWIN_EXTSN ()
#1  0x0000a25c in can_read (socket=3, timeout=0) at net.c:487
#2  0x0002183d in socket_producer [inlined] () at http/engine.c:637
#3  0x0002183d in start_httpd (port=2812, backlog=0, bindAddr=0x0) at http/engine.c:207
#4  0x00007596 in thread_wrapper (arg=0x0) at http.c:177
#5  0x96fad095 in _pthread_start ()
#6  0x96facf52 in thread_start ()
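The interactive session above can also be run non-interactively, which is handy when the freeze happens at an inconvenient time. The following is a minimal sketch, assuming a gdb build that supports the -batch and -ex options (older gdb versions may not); the helper names build_gdb_command and dump_thread_stacks are made up for this example.

```python
import subprocess


def build_gdb_command(binary, pid):
    """Build a gdb command line that lists threads and prints every backtrace."""
    return [
        "gdb", "-batch",
        "-ex", "info threads",
        "-ex", "thread apply all bt",
        binary, str(pid),
    ]


def dump_thread_stacks(binary, pid):
    """Run gdb against the live process and return whatever it printed."""
    result = subprocess.run(
        build_gdb_command(binary, pid),
        capture_output=True,
        text=True,
        check=False,  # gdb may exit non-zero; the caller can inspect the output
    )
    return result.stdout


if __name__ == "__main__":
    # Binary path and pid taken from the example session above.
    print(dump_thread_stacks("/usr/local/bin/monit", 941))
```

The "thread apply all bt" command replaces the manual thr/bt steps for each thread. Attaching normally requires root or the same user as the monit process.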

