Re: Performing an action only after X failures
From: Martin Pala
Subject: Re: Performing an action only after X failures
Date: Sun, 26 Dec 2004 21:26:52 +0100
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.3) Gecko/20041007 Debian/1.7.3-5
Hello,
if you are changing the configuration and restarting the
process by hand, you should:
1.) either disable monitoring of the service in monit before you
restart it by hand, and enable monitoring again afterwards:
# monit unmonitor myservice
# /etc/init.d/myservice restart
# monit monitor myservice
2.) or restart the service through monit itself (no need to unmonitor):
# monit restart myservice
If you don't, you risk a race condition (monit may try to start the
service during your manual restart). This behavior is common to all
process monitors (such as Sun Cluster, etc.): otherwise the process
monitor cannot tell whether the service was stopped on purpose or
failed by accident. For this case the feature you request is not a good
solution.
However, I agree that in other cases the ability to trigger a chosen
action as soon as the service reaches some error rate would be useful.
It would allow action rules to be split by error level. Currently we
support this only for timeout, using:
# if 2 restarts within 3 cycles then timeout
If this were generalized, it would allow rules to be stacked and
provide error-level-dependent actions, for example:
# if <X> <EVENT> within <Y> cycles then <ACTION>
where:
... <X> = number of event occurrences
... <EVENT> = event type
... <Y> = number of consecutive cycles
... <ACTION> = given action (alert|restart|unmonitor|exec|...)
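To make the intended semantics concrete, here is a minimal sketch in Python of how such a rule could be evaluated. It is not monit code: the EventWindow class and its names are hypothetical, and it assumes that at most one event is counted per cycle and that the rule fires whenever at least <X> of the last <Y> cycles saw the event.

```python
from collections import deque


class EventWindow:
    """Hypothetical evaluator for "if <X> <EVENT> within <Y> cycles"."""

    def __init__(self, count, cycles):
        # Assumption: at most one event per cycle, so X cannot exceed Y.
        if not 0 < count <= cycles:
            raise ValueError("need 0 < count <= cycles")
        self.count = count
        # Only the last `cycles` poll results are kept; older ones drop off.
        self.history = deque(maxlen=cycles)

    def record(self, occurred):
        """Record one poll cycle and return True if the rule fires."""
        self.history.append(bool(occurred))
        return sum(self.history) >= self.count


# Example rule: "if 2 failures within 3 cycles then <ACTION>"
rule = EventWindow(count=2, cycles=3)
results = [rule.record(failed) for failed in [True, False, False, True, True]]
```

With this sketch a single failed check never fires the rule; only the second failure inside the three-cycle window does. Stacking rules would then mean keeping one such window per rule, each with its own action.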
I'm +1 on adding such a feature. What do developers and users think about it?
Martin
Kaspar Landsberg wrote:
Hello,
I'd like monit to perform a given action for a given service only after
the service has failed for a given number of checks/cycles/minutes.
Example: Let's suppose I've got some daemon whose configuration I change.
But I make a mistake while changing the config and when I try to restart
the daemon, I get an error message, the daemon refuses to restart and for
a while there's no daemon running. It takes me 2 minutes to fix the error
in the conf file and to correctly restart the daemon. But if there was a
monit running at the same time with a low checking cycle, then the
predefined action for that daemon would be triggered.
I want to avoid such a scenario by telling monit to only trigger a given
action if the service/daemon in question fails for X cycles/minutes.
Is this already possible? If not, might that feature get added in the
near future?
Thanks,
Kaspar
PS: Looked at the archive but didn't find anything related to my
question.