monit-general

Re: [monit] Alerts not being triggered


From: Martin Pala
Subject: Re: [monit] Alerts not being triggered
Date: Tue, 20 Jan 2009 21:34:49 +0100

Thanks for the info.

Monit logs an error when the mail server fails or returns an error class >= 400. What exactly was the problem in your case? (We can improve the error reporting.) Since monit logged no error, it seems that the message was accepted by the mail server and the MTA dropped the message later?

Thanks,
Martin


On Jan 20, 2009, at 7:30 PM, Bruce Reed wrote:

The strace uncovered my problem and it was with a mail alias, so thanks for
the tip!

It would be nice to have more verbose logging by monit to log email events. Had I seen those in the log, I would have known that it at least sent the message and that the problem was with the address I had used. On the other hand, I should have combed my mail server logs to see if a message had been received for the address I specified.

Bruce


On 1/16/09 12:14 PM, "Martin Pala" <address@hidden> wrote:

Looks strange. I don't remember a problem like this, and even the changelog
doesn't mention such an issue.

It would be good to trace monit to see what happened:

strace -f -o monit.trace monit -vI


The monit.trace file will contain system call traces, so we can see
whether it tried to connect to the SMTP server and what happened.
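
As a rough sketch of how the trace could be read (an illustration, not something taken from monit's documentation): the helper below, under the hypothetical name smtp_connects, lists the connect() calls to TCP port 25 recorded in an strace output file. Port 25 is an assumption; adjust the pattern if the mail server listens on another port.

```shell
# Hypothetical helper: print the connect() calls to TCP port 25 that
# strace recorded in the given trace file.
# Assumption: the mail server listens on the standard SMTP port 25.
smtp_connects() {
    # strace renders the socket address as sin_port=htons(<port>)
    grep -E 'connect\(.*sin_port=htons\(25\)' "$1"
}
```

Running `smtp_connects monit.trace` after reproducing the problem prints one line per connection attempt. No output would suggest that monit never tried to reach the SMTP server on that port.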




On Jan 16, 2009, at 9:05 PM, Bruce Reed wrote:

4.9 rpm from rpmforge


On 1/16/09 11:55 AM, "Martin Pala" <address@hidden> wrote:

The configuration looks OK.

What monit version is it?

Thanks,
Martin


On Jan 16, 2009, at 8:22 PM, Bruce Reed wrote:

Here is the verbose output. Looks like verbose output begins and ends at
process start up (host/domain names changed):

Starting Process Monitor (monit): monit: Debug: Adding host allow 'localhost'
monit: Debug: Skipping redundant host 'localhost'
monit: Debug: Skipping redundant host 'localhost'
monit: Debug: Adding credentials for user 'admin'.
Runtime constants:
Control file       = /etc/monit.conf
Log file           = syslog
Pid file           = /var/run/monit.pid
Debug              = True
Log                = True
Use syslog         = True
Is Daemon          = True
Use process engine = True
Poll time          = 60 seconds
Mail server(s)     = prodsmtp.mydomain.net
Mail from          = address@hidden
Mail subject       = monit alert --  $EVENT $SERVICE
Mail message       = $EVENT Service $SERV..(truncated)
Start monit httpd  = True
httpd bind address = localhost
httpd portnumber   = 2812
httpd signature    = True
Use ssl encryption = False
httpd auth. style  = Basic Authentication and Host/Net allow list
Alert mail to      = address@hidden
Alert on         = All events

The service list contains the following entries:

Process Name          = ntpd
Pid file             = /var/run/ntpd.pid
Monitoring mode      = active
Start program        = '/etc/init.d/ntpd start' timeout 1 cycle(s)
Stop program         = '/etc/init.d/ntpd stop' timeout 1 cycle(s)
Pid                  = if changed 1 times within 1 cycle(s) then alert
Ppid                 = if changed 1 times within 1 cycle(s) then alert
Timeout              = If 3 restart within 3 cycles then unmonitor else if passed then alert

System Name           = test-prod.mydomain.net
Monitoring mode      = active


-------------------------------------------------------------------------------
monit: pidfile '/var/run/monit.pid' does not exist
Starting monit daemon with http interface at [localhost:2812]


Then when ntp is killed I see the following in /var/log/messages:

Jan 16 19:10:50 test-prod ntpd[13505]: ntpd exiting on signal 15
Jan 16 19:11:32 test-prod monit[2398]: 'ntpd' process is not running
Jan 16 19:11:32 test-prod monit[2398]: 'ntpd' trying to restart
Jan 16 19:11:32 test-prod monit[2398]: 'ntpd' start: /etc/init.d/ntpd
Jan 16 19:11:32 test-prod ntpd[2541]: ntpd address@hidden Tue Jun 10 00:07:18 UTC 2008 (1)
Jan 16 19:11:32 test-prod ntpd[2542]: precision = 2.000 usec
.
.

There is no additional output from monit and no attempt to send mail
according to maillog.

On 1/16/09 3:52 AM, "Jan-Henrik Haukeland" <address@hidden> wrote:

Have you tried to specify which mail server Monit should use for alerts?

See
http://mmonit.com/monit/documentation/monit.html#setting_a_mail_server_for_alert_messages



On 16. jan.. 2009, at 08.00, Bruce Reed wrote:

I’ve just begun using monit and I am having difficulties getting
monit to send mail. I’m testing using ntpd and it is restarting the
process, but not sending mail on service restart events or timeout.
In monit.conf I have:

set alert address@hidden

I then had a check statement like this:

check process ntpd with pidfile /var/run/ntpd.pid
 start program = "/etc/init.d/ntpd start"
 stop program  = "/etc/init.d/ntpd stop"
 if 3 restarts within 3 cycles then timeout
 alert address@hidden only on { timeout }

After 3 successive kills of ntpd and restarts by monit, a timeout
message was logged, but no mail was sent. I tried removing the alert
statement to see if mail would be sent on any event, but I only see
information logged and no mail is sent. Nothing in /var/log/maillog
either.

Funny thing is, when I first set this up monit attempted to send
mail, but an ACL on my postfix server prevented it from getting
through. I fixed that and retried my test, but from that point on no
mail was sent. Thought perhaps this was a state caching issue, but
there was no change across a monit restart, and when I installed monit on
another server using the same conf files I got the same results there.

Thanks,
Bruce



--
To unsubscribe:
http://lists.nongnu.org/mailman/listinfo/monit-general


