Re: [monit] MONIT_DESCRIPTION returning old description not new one

From: Martin Pala
Subject: Re: [monit] MONIT_DESCRIPTION returning old description not new one
Date: Wed, 03 Dec 2008 20:27:45 +0100
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv: Gecko/20081030 Iceape/1.1.13 (Debian-1.1.13-1)

The fix should be relatively easy. The problem is that MONIT_DESCRIPTION currently returns the error described by the first member of the s->eventlist list.

This is wrong. The eventlist is a list of all events which have occurred, and it is used as state history. It provides information about the frequency of a given error and supports the logic which changes state only when the given error rate is matched. Some of these events may thus be inactive and some active, so the first member is not necessarily the event that triggered the action.

In order to get the correct information, handle_action() in event.c should call control_service() with the event as an additional argument. That makes it possible to get the correct description of the error, which is set in set_monit_environment(). See event.c for how the event engine works, and control.c + spawn.c for program execution.

We're very busy now, so if you want to give it a stab it will help. Patches are welcome :)


index one wrote:
Thanks for the info,
Anything that I can do to help / check?
Is it an easy fix / likely to be fixed soon?
Maybe if you could point me in the right direction I could look at
fixing myself / getting someone to fix it and submit a patch.

Thanks for your help,

2008/12/3 Martin Pala <address@hidden>:
Thanks for the report. Yes, MONIT_DESCRIPTION shows incorrect information
(it points to the first event only) ... we'll fix it.


On Nov 30, 2008, at 3:43 AM, index one wrote:

I currently have a problem with monit failing to return the correct
MONIT_DESCRIPTION on one log file.
I monitor about 6 log files and one of them on occasion returns an old
message on a regex match and not the current message.
When this happens it will continue sending the old message on every
regex match for that file until monit is restarted.
Currently I am only seeing this on one log file (heartbeat, ha-log).
I have tried monit-5.0_beta4 and a patched version of monit-4.10.1
that allows MONIT_DESCRIPTION in exec.
Alert emails contain the correct $DESCRIPTION tag, but the
$MONIT_DESCRIPTION passed to exec is incorrect.
The script sends the message as an SNMP trap, but just
echoing $MONIT_DESCRIPTION to a temp file has the same result.

-------------------------- test----------
echo "ERROR" >> /var/logs/ha-log
echo "WARN" >> /var/logs/ha-log
content match [ERROR]
content match [ERROR]

OS is Solaris 10
Below is my monitrc (monit is started from init)

Any help appreciated

set init
set alert address@hidden #for testing
set daemon 1
set logfile /var/log/monit
set httpd port 2812
allow admin:monit

check file raid-log with path /opt/StorMan/RaidEvtA.log
ignore match /etc/opt/raid.nomatch
if match /etc/opt/raid.match then
exec "/usr/bin/ raid"

check file Apache-log with path /var/log/apache/error.log
ignore match /etc/opt/Apache.nomatch
if match /etc/opt/Apache.match then
exec "/usr/bin/ Apache-log"

check file Mysql-log with path /var/log/mysql/testserv1.err
ignore match /etc/opt/Mysql.nomatch
if match /etc/opt/Mysql.match then
exec "/usr/bin/ Mysql-log"

check file Mysql-Replication-log with path /var/log/mysql/testserv1.err
ignore match /etc/opt/Mysql-Replication.nomatch
if match /etc/opt/Mysql-Replication.match then
exec "/usr/bin/ Mysql-Replication-log"

check file ha-log with path /var/logs/ha-log
ignore match /etc/opt/ha.nomatch
if match /etc/opt/ha.match then
exec "/usr/bin/ ha-log"

check file system-log with path /var/adm/messages
ignore match /etc/opt/system.nomatch
if match /etc/opt/system.match then
exec "/usr/bin/ system"
