monit-general

Re: Force monitoring of service?


From: Adam Nielsen
Subject: Re: Force monitoring of service?
Date: Thu, 20 Jan 2011 17:03:38 +1000
User-agent: Thunderbird 2.0.0.23 (X11/20091130)

> Monit cannot monitor a dead service, so there's no forcing that issue. That's like trying to force a medical heartbeat monitor to continue to give you the heart rate and pulse of a dead patient.

That's true, but unlike a heartbeat monitor, monit can do something about the issue: in this case, attempt to restart the service!

> Monit MUST at some point give up trying to restart a service that refuses to start. That condition indicates that something is VERY wrong with the service core, its dependencies, or some other environmental condition that Monit cannot be aware of or control. You can have Monit monitor these other items, and you might find the problem or get a hint as to what it might be.

In this case, I suspect that extended power outages caused by recent flooding made DNS unavailable for approximately a day, which caused both MySQL and Exim to terminate and then refuse to start during the outage.

After five failed restart attempts for each of these services, monit gave up (which is fine; I had told it to do so). I have since removed that condition, because if monit had kept attempting restarts, the services would have started normally a day later when DNS returned.
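
To make that reasoning concrete, here is a minimal sketch in Python. It is a toy model, not monit's actual logic: the names `simulate` and `Outcome` are made up for illustration, it assumes one restart attempt per poll cycle, and it assumes a 2-minute cycle so that a one-day outage lasts 720 cycles.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Outcome:
    recovered_at: Optional[int]  # cycle in which the service started, or None
    attempts: int                # restart attempts made so far
    gave_up: bool                # True if the monitor stopped trying


def simulate(outage_cycles: int, total_cycles: int,
             max_restarts: Optional[int]) -> Outcome:
    """Model a service that cannot start while a dependency is down.

    The dependency (e.g. DNS) is down for cycles 0 .. outage_cycles - 1.
    The monitor makes one restart attempt per cycle (an assumption).
    max_restarts=None means the monitor never gives up.
    """
    attempts = 0
    for cycle in range(total_cycles):
        # Give up once the restart budget has been spent.
        if max_restarts is not None and attempts >= max_restarts:
            return Outcome(None, attempts, True)
        attempts += 1
        # The restart only succeeds once the dependency is back.
        if cycle >= outage_cycles:
            return Outcome(cycle, attempts, False)
    return Outcome(None, attempts, False)


if __name__ == "__main__":
    one_day = 24 * 60 // 2  # 720 cycles, assuming a 2-minute poll cycle
    print(simulate(one_day, 1000, max_restarts=5))
    print(simulate(one_day, 1000, max_restarts=None))
```

Under these assumptions, the five-attempt limit is used up within the first few cycles of the outage and the service stays down, while the unlimited policy starts the service in the first cycle after DNS returns.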

However, after removing the timeout line from the monit config file and restarting monit, it has not automatically restarted the services, which I expected it to do. I believe this is because it thinks the services were stopped manually, so I am looking for a way to tell it that no, the service really died and should be restarted if it is not already running when monit first loads.

If this is not possible, there will always be a tiny bit of doubt in my mind that, should monit die at the exact moment one of my important services does, it will decide not to restart the crashed service when monit itself is restarted (e.g. via init).

> In short, the service you are monitoring should be healthy to begin with. Monit is there for the freak or occasional service crash. It's not designed to troubleshoot anything, nor to 'force' anything.

I disagree. I tend to use it to try a bunch of different remedial actions when things fail, and it works really well. It's nice knowing that you can leave your server alone through various outages and it will just fix itself ;-)

Cheers,
Adam.



