Re: Bugs on status notifications

From: Jan-Henrik Haukeland
Subject: Re: Bugs on status notifications
Date: Fri, 25 Nov 2005 23:28:30 +0100

On 22. nov. 2005, at 19.24, Don Parker wrote:

Using 4.6:

I have 4 processes I am monitoring, and each has a rule similar to:

        If failed port 1234 with timeout 15 seconds for 3 times within 5 cycles then restart

The rules are identical for the 4 processes except for the ports used.

When I look at them through the web server I see that 2 of
them have augmented the rule I placed in the rc file with
"else if passed 1 times within 1 cycle(s) then alert", which
is creating a lot of records in my log that I do not want to see.

Do you run with the -v debug option? If so, please turn it off. It floods the log file and should only be used when you really need to debug.

I have no idea where this came from - it is not in my rc file.

Ah yes, maybe it is better for Martin to chip in here, since he is responsible for most of the event system and can give a better explanation of the rationale behind it. The thing is, AFAIK, an else clause is automatically added to every if-test if none was specified. This is done so that if a test fails you get an alert both when the service goes down _and_ when it comes back up again. The idea is that this information can be useful. Let's say a service goes down in the middle of the night and you get an SMS from monit. Right after, monit will send you another SMS (if so configured) saying that the service is back online again, if it managed to fix the problem. That way, you can go back to sleep.

The automatic up notification will also be very useful when we finally get m/monit released, since m/monit collects both up and down alerts and can display them in a nice statistical diagram, which can be useful for historical and SLA reasons.
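
The down/up alert pairing described above can be illustrated with a small sketch. This is not monit's code; it is a simplified model that assumes an alert is raised only when a test changes state, and the function name and the "failed"/"recovered" labels are made up for the example.

```python
def alerts_for(results):
    """Return (cycle, message) pairs for each state change of one test.

    `results` holds one boolean per poll cycle:
    True means the test passed, False means it failed.
    """
    alerts = []
    previous = True  # assumption: the service starts out healthy
    for cycle, passed in enumerate(results, start=1):
        if passed != previous:
            # One alert when the test starts failing, one when it passes again.
            alerts.append((cycle, "recovered" if passed else "failed"))
        previous = passed
    return alerts


history = [True, False, False, True, True]
print(alerts_for(history))  # [(2, 'failed'), (4, 'recovered')]
```

In this model the second alert is the "you can go back to sleep" message: it only appears because the passing state is treated as an event in its own right.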

I have set up logging to a file rather than to use syslog,
and I launch Monit through "integration with init".  I see
all entries being sent to the log also being echoed on tty1.
Not a big deal, but not expected either.

This is a "side effect" of wanting a tty connected to monit for debug purposes when it runs in non-daemon mode. To see what I mean, try running monit from the console like so: "monit -v validate", which will run monit once and print lots of interesting debug info to the console. Anyway, to turn this off when running from init, use the standard file descriptor redirect, like so: "monit -I >/dev/null 2>&1". The order matters: stdout has to be redirected first so that stderr follows it to /dev/null.
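
Since the order of shell redirections is easy to get wrong, here is a small sketch that shows the difference. It does not involve monit at all: it assumes a POSIX /bin/sh is available and uses a stand-in command (the hypothetical WRITER below) that prints one line to stdout and one to stderr.

```python
import subprocess

# Stand-in for a program that writes to both stdout and stderr.
WRITER = "(echo out; echo err >&2)"


def captured(redirects):
    """Run WRITER with the given redirections; return what still reaches stdout."""
    result = subprocess.run(WRITER + " " + redirects, shell=True,
                            capture_output=True, text=True)
    return result.stdout


# stderr is pointed at the *original* stdout before stdout is discarded,
# so the stderr line still comes through.
print(repr(captured("2>&1 1>/dev/null")))  # 'err\n'

# stdout is discarded first, then stderr follows it: nothing comes through.
print(repr(captured(">/dev/null 2>&1")))   # ''
```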

This is worse than I thought. I tried to turn off the alerts by adding
my own "else if passed then exec <something>" clause. While my exec does
get executed I still see the alerts on passed tests.

Every if-test raises an alert, even if you have specified another action in the "then" clause, such as an exec. To turn off alerts, simply remove the "set alert .." statement from the monitrc file and, optionally, add an "alert .." statement only to the services you want to raise alerts for. Please see the manual for more information.

Q: Is there any relationship between timeouts and poll intervals? For
example, if my monitrc file has "set daemon 15" and I have a port check
rule with a timeout of 30 seconds, does my port check rule really wait
30 seconds or does monit see it failing every 15 seconds?

Monit runs sequentially in a "validate->sleep->validate->sleep.." pattern, where "validate" is the testing of all services mentioned in the monitrc file. If you have a 30-second timeout in a port check, monit really waits up to 30 seconds before timing out. This, of course, means that a poll cycle really takes (sleep-time + validate-time).
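
To put numbers on that, here is a minimal sketch of the arithmetic. It is only a model of the sequential pattern described above, not monit's implementation, and the function and parameter names are invented for the example.

```python
def poll_cycle_seconds(sleep_seconds, check_durations):
    """Length of one cycle when checks run one after another, then monit sleeps.

    `check_durations` lists how long each service test took, in seconds.
    """
    validate_seconds = sum(check_durations)  # sequential, so the times add up
    return sleep_seconds + validate_seconds


# "set daemon 15" with a single port check that hangs until its 30 s timeout.
print(poll_cycle_seconds(15, [30]))  # 45

# The same setup when the port answers within a second.
print(poll_cycle_seconds(15, [1]))   # 16
```

So with "set daemon 15" and a 30 second timeout, a hanging port is not seen failing every 15 seconds; in this model the next test of that port starts about 45 seconds after the previous one started.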

The good thing about open-source is that you can read the code and see how monit really works and even send us patches if you want to fix something :)

Jan-Henrik Haukeland
Mobil +47 97141255
