Re: if x restarts within y cycles then exec "script"

From: Martin Pala
Subject: Re: if x restarts within y cycles then exec "script"
Date: Tue, 20 Feb 2007 17:31:32 +0100
User-agent: Thunderbird (Windows/20061207)

Javi Roman wrote:
I've tested the patch timeout_to_exec.patch and it worked fine. I
think it would be a good Monit improvement.

It can be added in the future (thanks to Alec for the patch), but the implementation first needs to be checked with regard to the event handler integration.

Nevertheless, I would
like to know whether it's possible to do something similar to:
if 3 restarts within 6 cycles then exec "/sbin/reboot"
with the current Monit version.

There are several ways. For example, you can modify the start script (used in the "start program ..." statement) to count the number of consecutive restarts and, if the count reaches the given ratio, perform the reboot from that script.
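A minimal sketch of such a start script follows. It is only an illustration, not something shipped with monit: the state file path, the service start command, and the function name are hypothetical. Because the script cannot see monit cycles, the window is given in seconds (roughly 6 cycles times your cycle length). It also assumes that `date +%s` is available, which is not the case on every platform.

```shell
#!/bin/sh
# Hypothetical start wrapper: all names, paths and defaults are assumptions.
STATE_FILE="${STATE_FILE:-/var/tmp/myservice_restarts}"  # one epoch timestamp per line
MAX_RESTARTS="${MAX_RESTARTS:-3}"                        # restarts that trigger the reboot
WINDOW="${WINDOW:-60}"                                   # window in seconds (~6 cycles)
START_CMD="${START_CMD:-/etc/init.d/myservice start}"    # hypothetical service start command
REBOOT_CMD="${REBOOT_CMD:-/sbin/reboot}"

start_with_reboot_guard() {
    now=$(date +%s)
    [ -f "$STATE_FILE" ] || : > "$STATE_FILE"
    # Keep only the restarts that happened inside the window, then record this one.
    awk -v now="$now" -v win="$WINDOW" 'now - $1 < win' "$STATE_FILE" > "$STATE_FILE.tmp"
    echo "$now" >> "$STATE_FILE.tmp"
    mv "$STATE_FILE.tmp" "$STATE_FILE"
    count=$(( $(wc -l < "$STATE_FILE") ))
    if [ "$count" -ge "$MAX_RESTARTS" ]; then
        rm -f "$STATE_FILE"      # start counting from scratch after the reboot
        $REBOOT_CMD
    else
        $START_CMD
    fi
}
```

To use it, call start_with_reboot_guard as the last line of the script and point the "start program" statement at the script.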

Another possibility is to just touch some state file from the start script and use the timestamp test as described in FAQ question no. 13. You can then check the timestamp for example this way:

  check file reboot_trigger with path /tmp/restart_flag
    if timestamp < 10 seconds for 6 cycles then exec "/sbin/reboot"

When the start script is executed, it touches /tmp/restart_flag, so its timestamp is updated. Monit watches the timestamp, and if the file is updated in each of 6 consecutive cycles, it execs the reboot. When the start script succeeds, the timestamp is updated just once and becomes older than 10 seconds, so the timestamp test won't match. Note that the timestamp value depends on your monit cycle length: for example, if the monit poll cycle is 5s, then a timestamp of 10s should be fine.
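The start script side of this could look roughly like the sketch below. The flag path matches the check above; the service start command and the function name are hypothetical placeholders.

```shell
#!/bin/sh
# Hypothetical start wrapper: START_CMD and the function name are assumptions.
FLAG_FILE="${FLAG_FILE:-/tmp/restart_flag}"              # file watched by the timestamp test
START_CMD="${START_CMD:-/etc/init.d/myservice start}"    # hypothetical service start command

start_and_flag() {
    # Refresh the timestamp so monit sees that a (re)start was attempted.
    touch "$FLAG_FILE"
    $START_CMD
}
```

Call start_and_flag at the end of the script used in the "start program" statement.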

Related FAQ excerpt:

13. Q: Is there any support for external testing scripts available?

A: We plan to add support for external scripts in the future (see our
       TODO list). Until native support is available, here are some
       workarounds:

       1.) A nice workaround contributed by Pavel Urban is based on timestamp
       monitoring of a file that is updated by an external script run from
       cron. When everything is OK, the script updates (touches) the file.
       When the state is false, the script won't update the timestamp and
       monit will perform the related action.

       For example, a script for monitoring the count of files inside /tmp:
       # touch the flag only while /tmp holds fewer than 100 entries
       if [ `ls -1 /tmp | wc -l` -lt 100 ]; then
         touch /var/tmp/monit_flag_tmp
       fi

       Run this script via cron (for example, every 20 minutes):
        0,20,40 * * * * /root/test_tmp_files > /dev/null 2>&1

       and do a timestamp check on /var/tmp/monit_flag_tmp (or any file you
       decide) in the monit control file:
        check file monit_flag_tmp with path /var/tmp/monit_flag_tmp
          if timestamp > 25 minutes then alert

       Done :)

       Another example script, for monitoring the Solaris Volume Manager:
       /usr/sbin/metastat | /usr/xpg4/bin/grep -q maintenance
       # grep exits non-zero when no "maintenance" state is reported
       if [ $? -ne 0 ]; then
         touch /var/tmp/monit_flag_svm
       fi

       2.) Alternatively, you can use monit's file content testing to watch
       logfiles or status files created in a similar way as described above.

       Example script:
       /usr/sbin/metastat > /var/tmp/monit_svm

       and example monit syntax:
       check file svm with path /var/tmp/monit_svm
         if match "maintenance" then alert

