From: Sasha Yohananov
Subject: Re: Cannot get Monit to run more than 60 seconds
Date: Thu, 16 Dec 2010 15:02:26 +0200

Hi everyone..
I've about reached the end of my road trying to get Monit to run, and at this point
I'm simply crying 'uncle' and posting for help. I have Googled, I have read documentation,
I have studied examples, all to no avail so far. The app runs for the 60 second 'wait'
period specified in my monitrc, then goes away. No matter what I've tried, I get the exact
same result.

Let me begin by saying I followed this guide here:
http://www.howtoforge.com/server-monitoring-with-munin-and-monit-on-centos-5.2-p2

I went through the setup for a 64 bit box with CentOS 5 Final. Every step matched what was
documented to the 'T'. After doing the SSL certs, the website said "finally, we can start
Monit: /etc/init.d/monit start", which I did. It complained my mysqld wasn't in the right
path, nor my postfix. I just commented those entries out to come back to them later, and
restarted the daemon. It seemed to grab, as a ps aux | grep monit showed it running, and
/etc/init.d/monit status confirmed it. I opened a browser and pointed it to my box with
the proper port, but got nothing. Went back to the running processes and found Monit dead.

Going through the monit.log, I saw there was an id error, because the folder expected to
hold the id wasn't there. I created it, re-ran the daemon, and this time it reported that
it wrote a unique id file to the directory I created, and it was once again running. 60
seconds later, it was dead again. The monit.log revealed nothing out of the ordinary, here
is what a cycle of start -> dead looks like in the log:

[EST Dec 15 14:11:26] info : monit: generated unique Monit id 99655fc9cc168e531b8d9734cab746b9 and stored to '/var/monit/id'
[EST Dec 15 14:11:26] info : Starting monit daemon with http interface at [*:2812]
[EST Dec 15 14:11:26] info : Monit start delay set -- pause for 60s
[EST Dec 15 14:12:26] info : Starting monit HTTP server at [*:2812]

I then started running the daemon in the foreground with noise, and frankly, if the problem
is revealed in there, I don't see it. Here's that:

$ /usr/bin/monit -d 10 -c /etc/monit.d/monitrc -v -l /var/log/monit.log
monit: Debug: Adding net allow '{my_home_ip_here}'.
monit: Debug: Adding credentials for user 'admin'.
Runtime constants:
Control file = /etc/monit.d/monitrc
Log file = /var/log/monit.log
Pid file = /var/run/monit.pid
Debug = True
Log = True
Use syslog = False
Is Daemon = True
Use process engine = True
Poll time = 10 seconds with start delay 0 seconds
Expect buffer = 256 bytes
Mail from = (not defined)
Mail subject = (not defined)
Mail message = (not defined)
Start monit httpd = True
httpd bind address = Any/All
httpd portnumber = 2812
httpd signature = True
Use ssl encryption = True
PEM key/cert file = /var/certs/monit.pem
Client cert file = None
Allow self certs = False
httpd auth. style = Basic Authentication and Host/Net allow list

The service list contains the following entries:

Process Name = proftpd
Pid file = /var/run/proftpd.pid
Monitoring mode = active
Start program = '/etc/init.d/proftpd start' timeout 30 second(s)
Stop program = '/etc/init.d/proftpd stop' timeout 30 second(s)
Existence = if does not exist 1 times within 1 cycle(s) then restart else if succeeded 1 times within 1 cycle(s) then alert
Pid = if changed 1 times within 1 cycle(s) then alert
Ppid = if changed 1 times within 1 cycle(s) then alert
Port = if failed localhost:21 [FTP via TCP] with timeout 5 seconds 1 times within 1 cycle(s) then restart else if succeeded 1 times within 1 cycle(s) then alert
Timeout = If restarted 5 times within 5 cycle(s) then unmonitor

Process Name = sshd
Pid file = /var/run/sshd.pid
Monitoring mode = active
Start program = '/etc/init.d/sshd start' timeout 30 second(s)
Stop program = '/etc/init.d/sshd stop' timeout 30 second(s)
Existence = if does not exist 1 times within 1 cycle(s) then restart else if succeeded 1 times within 1 cycle(s) then alert
Pid = if changed 1 times within 1 cycle(s) then alert
Ppid = if changed 1 times within 1 cycle(s) then alert
Port = if failed localhost:22 [SSH via TCP] with timeout 5 seconds 1 times within 1 cycle(s) then restart else if succeeded 1 times within 1 cycle(s) then alert
Timeout = If restarted 5 times within 5 cycle(s) then unmonitor

Process Name = apache
Group = www
Pid file = /var/run/httpd.pid
Monitoring mode = active
Start program = '/etc/init.d/httpd start' timeout 30 second(s)
Stop program = '/etc/init.d/httpd stop' timeout 30 second(s)
Existence = if does not exist 1 times within 1 cycle(s) then restart else if succeeded 1 times within 1 cycle(s) then alert
Pid = if changed 1 times within 1 cycle(s) then alert
Ppid = if changed 1 times within 1 cycle(s) then alert
Port = if failed www.ezcommunities.com:80/monit/token [HTTP via TCP] with timeout 5 seconds 1 times within 1 cycle(s) then restart else if succeeded 1 times within 1 cycle(s) then alert
Load avg. (5min) = if greater than 10.0 8 times within 8 cycle(s) then stop else if succeeded 1 times within 1 cycle(s) then alert
Children = if greater than 250 1 times within 1 cycle(s) then restart else if succeeded 1 times within 1 cycle(s) then alert
CPU usage limit = if greater than 80.0% 5 times within 5 cycle(s) then restart else if succeeded 1 times within 1 cycle(s) then alert
CPU usage limit = if greater than 60.0% 2 times within 2 cycle(s) then alert else if succeeded 1 times within 1 cycle(s) then alert
Timeout = If restarted 3 times within 5 cycle(s) then unmonitor

System Name = system_{myexample.site.com}
Monitoring mode = active

-------------------------------------------------------------------------------
Starting monit daemon with http interface at [*:2812]

monit.log says:
[EST Dec 16 02:27:04] info : Starting monit daemon with http interface at [*:2812]
[EST Dec 16 02:27:04] info : Starting monit HTTP server at [*:2812]
[EST Dec 16 02:27:04] info : monit HTTP server started
[EST Dec 16 02:27:04] info : 'system_{myexample.site.com}' Monit started

/etc/init.d/monit status says:
monit dead but pid file exists

For completeness, here is monitrc:

set daemon 60 with start delay 60
set logfile /var/log/monit.log
# set mailserver localhost
# set mail-format { from: address@hidden} }
# set alert address@hidden
set httpd port 2812 and
SSL ENABLE
PEMFILE /var/certs/monit.pem
allow {my_home_ip_here}
allow admin:test

check process proftpd with pidfile /var/run/proftpd.pid
start program = "/etc/init.d/proftpd start"
stop program = "/etc/init.d/proftpd stop"
if failed port 21 protocol ftp then restart
if 5 restarts within 5 cycles then timeout

check process sshd with pidfile /var/run/sshd.pid
start program "/etc/init.d/sshd start"
stop program "/etc/init.d/sshd stop"
if failed port 22 protocol ssh then restart
if 5 restarts within 5 cycles then timeout

# check process mysql with pidfile /var/run/mysqld/mysqld.pid
# group database
# start program = "/usr/sbin/mysqld start"
# stop program = "/usr/sbin/mysqld stop"
# if failed host 127.0.0.1 port 3306 then restart
# if 5 restarts within 5 cycles then timeout

check process apache with pidfile /var/run/httpd.pid
group www
start program = "/etc/init.d/httpd start"
stop program = "/etc/init.d/httpd stop"
if failed host {myexample.site.com} port 80 protocol http
and request "/monit/token" then restart
if cpu is greater than 60% for 2 cycles then alert
if cpu > 80% for 5 cycles then restart
# if totalmem > 500 MB for 5 cycles then restart
if children > 250 then restart
if loadavg(5min) greater than 10 for 8 cycles then stop
if 3 restarts within 5 cycles then timeout

# check process postfix with pidfile /var/spool/postfix/pid/master.pid
# group mail
# start program = "/etc/init.d/postfix start"
# stop program = "/etc/init.d/postfix stop"
# if failed port 25 protocol smtp then restart
# if 5 restarts within 5 cycles then timeout

As stated, I'm at a dead end. I have no idea what to try next. I've tried everything
I could find in a variety of other trouble posts, but I always end up with a dead service
after 60 seconds.

Help appreciated. = )
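
For anyone trying to reproduce the two symptoms above (the "dead but pid file exists" status and the 60 second gap in monit.log), here is a minimal Python sketch. It is only an illustration and not part of Monit: the helper names `pid_file_state`, `log_time` and `seconds_between` are made up for this example, and it assumes a Unix-like system and log lines shaped exactly like the ones pasted above (no year in the timestamp).

```python
import os
from datetime import datetime


def pid_file_state(pid_path):
    """Classify a pid file as 'missing', 'stale', or 'running' (hypothetical helper)."""
    try:
        with open(pid_path) as handle:
            pid = int(handle.read().strip())
    except FileNotFoundError:
        return "missing"
    except ValueError:
        return "stale"  # file exists but does not hold a number
    if pid <= 0:
        return "stale"  # not a usable process id
    try:
        os.kill(pid, 0)  # signal 0 only checks that the process exists
    except ProcessLookupError:
        return "stale"  # the "dead but pid file exists" case
    except PermissionError:
        return "running"  # process exists but belongs to another user
    return "running"


def log_time(line):
    """Parse the '[EST Dec 15 14:11:26]' prefix of a monit.log line."""
    stamp = line[line.index("[") + 1:line.index("]")]
    # Drop the timezone word; the log carries no year, so strptime uses 1900.
    _, month, day, clock = stamp.split()
    return datetime.strptime(f"{month} {day} {clock}", "%b %d %H:%M:%S")


def seconds_between(first_line, second_line):
    """Elapsed seconds between two monit.log lines from the same year."""
    return (log_time(second_line) - log_time(first_line)).total_seconds()
```

Calling `pid_file_state("/var/run/monit.pid")` (the pid file path shown in the runtime constants) tells you whether the init script's status message is down to a leftover pid file, and `seconds_between` on the "pause for 60s" and "Starting monit HTTP server" lines confirms how long the start delay really lasted before the daemon went away.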
- Keith
--
To unsubscribe:
http://lists.nongnu.org/mailman/listinfo/monit-general