Re: Request for a new "script" service type

monit-general

From:	Michel Marti
Subject:	Re: Request for a new "script" service type
Date:	Wed, 22 Dec 2004 10:14:18 +0100
User-agent:	Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.3) Gecko/20041007 Debian/1.7.3-5

Martin Pala wrote:

> 1.) the example which you showed is possible to integrate with monit already using existing file timestamp test as mediator: your script can be run from cron in regular intervals (for example each 5 minutes) and in the case that everything is ok, it could touch some file (for example "/tmp/check_myservice.ok"). This will update its timestamp, which can monit test this way:


There are several problems with this:

1. I don't (yet) have cron on this box (it's an ARM-based embedded device with a limited amount of storage and RAM). I could, however, install cron to "fix" this.
2. My monit interval is set to 30 seconds, but the smallest interval in cron is one minute.
3. My embedded device has no battery-buffered clock. This means that on bootup the clock will be set to the start of the epoch (1970), but later it will be synchronized using ntp. This might trigger an unnecessary restart of the service because monit thinks that the file has not been touched within the specified time.
4. Monitoring will be split across two systems (cron/monit). This might not be obvious for users looking at only the crontab or only the monit configuration. Of course, this can be fixed by adding documentation to monitrc/crontab.
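To illustrate problem 3, here is a minimal Python sketch of a timestamp-age test. It is not monit's actual implementation; `is_stale` and the 300-second limit are made up for the example. A marker file touched while the clock still reads 1970 looks fresh before the ntp sync and decades old right after it:

```python
import os
import tempfile
import time


def is_stale(path, max_age_seconds, now=None):
    """Return True if the file's mtime is older than max_age_seconds.

    Hypothetical helper: it mimics the idea of a timestamp test, nothing more.
    """
    if now is None:
        now = time.time()
    return (now - os.stat(path).st_mtime) > max_age_seconds


# Simulate the boot sequence on a device without a battery-buffered clock.
with tempfile.NamedTemporaryFile(delete=False) as f:
    marker = f.name
os.utime(marker, (60, 60))          # "touched" 60 s after the epoch (1970)

before_sync = is_stale(marker, 300, now=120)         # False: file is 60 s old
after_sync = is_stale(marker, 300, now=1103706858)   # True: clock jumped to Dec 2004
os.remove(marker)
```

The file was touched on schedule in both cases; only the clock jump makes it appear stale.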

> On monit side it should be possible to set at least timeout for method (there
> could be some default value, such as 5 seconds).

Agreed. And monit might also pass some information to the script using environment variables (e.g. MONIT_SERVICE=<service name>, etc.).
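As a rough sketch of the timeout and environment-variable idea (in Python for brevity; this is not monit code, and `run_check` is a hypothetical name), the caller could export MONIT_SERVICE, run the script, and give up once the timeout expires:

```python
import os
import subprocess
import sys


def run_check(command, service_name, timeout_seconds=5.0):
    """Run a check script with MONIT_SERVICE set; return (ok, output).

    Hypothetical helper. A script that exceeds the timeout is killed
    and reported as failed.
    """
    env = dict(os.environ, MONIT_SERVICE=service_name)
    try:
        result = subprocess.run(
            command,
            env=env,
            capture_output=True,
            text=True,
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired:
        return False, "timed out after %s seconds" % timeout_seconds
    return result.returncode == 0, result.stdout.strip()


# Example: a stand-in "script" that just echoes the variable it was given.
echo_service = [sys.executable, "-c",
                "import os; print(os.environ['MONIT_SERVICE'])"]
ok, output = run_check(echo_service, "rootfs")
```

The 5-second default mirrors the value suggested in the quoted text.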

> I'm not sure whether it is good to define new 'script' object. I think it could be sufficient to support the generic testing method interface in all existing objects (i.e. 'process', 'device', 'host', 'file', 'directory'). Example syntax:
> check device rootfs with path /
>   if failed script "/sbin/check_lvm rootvol" with timeout 7s then alert
>   if space usage > 90% then alert
>   ...
> ---

I think this would be enough for most cases, but it introduces some overhead when trying to monitor aspects of the system that are not covered by monit at all. E.g. if I want to send an alert when the number of established TCP connections exceeds a certain limit, I would have to do something like this:


check file tcp-connections with path /dev/null
   if failed script "/sbin/check_connections --max=1000" with timeout 5s then alert
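The script /sbin/check_connections above is only a name used for the example. A minimal sketch of what such a check might do, assuming a Linux system where /proc/net/tcp lists IPv4 sockets and state 01 means ESTABLISHED (all function names here are hypothetical):

```python
import argparse


def count_established(proc_net_tcp_text):
    """Count ESTABLISHED sockets in text formatted like /proc/net/tcp."""
    count = 0
    for line in proc_net_tcp_text.splitlines()[1:]:   # skip the header line
        fields = line.split()
        # The fourth column ("st") holds the socket state in hex; 01 = ESTABLISHED.
        if len(fields) > 3 and fields[3] == "01":
            count += 1
    return count


def exit_code_for(count, max_connections):
    """Exit code 0 while within the limit, 1 once the limit is exceeded."""
    return 0 if count <= max_connections else 1


def main(argv):
    parser = argparse.ArgumentParser()
    parser.add_argument("--max", type=int, required=True, dest="max_connections")
    args = parser.parse_args(argv)
    with open("/proc/net/tcp") as f:   # IPv4 only; IPv6 lives in /proc/net/tcp6
        count = count_established(f.read())
    return exit_code_for(count, args.max_connections)

# As a script, the last line would be: sys.exit(main(sys.argv[1:]))
```

The exit code is all the caller needs here, which fits the suggestion that reporting an event type and description should stay optional.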

> The method will return appropriate event type in the case of failure/passed state and event description and monit will handle the defined action. The timeout serves as safety for the case that the method will be jammed.

OK, but I suggest that returning the event type and description should be optional. If the script does not return this information, monit should assume the (new) event type "script failed". To determine the general failure/success of the script, monit should IMO look at the script's exit code.
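A small sketch of that rule in Python. The output convention (first line is the event type, the rest is the description) is purely an assumption for illustration, since no format has been agreed on, and `interpret_result` is a hypothetical name:

```python
def interpret_result(exit_code, output=""):
    """Map a script's exit code and optional output to an event.

    Returns None on success, otherwise an (event_type, description) tuple.
    Assumed convention: first output line = event type, remaining lines =
    description. Both are optional.
    """
    if exit_code == 0:
        return None                       # exit code alone decides success
    default_description = "exit code %d" % exit_code
    lines = [line.strip() for line in output.strip().splitlines()]
    if not lines:
        # Script reported nothing: fall back to the generic event type.
        return ("script failed", default_description)
    description = " ".join(lines[1:]) or default_description
    return (lines[0], description)
```

For example, `interpret_result(2)` yields `("script failed", "exit code 2")`, while a script that prints its own event type and description gets those passed through.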



Michel
