Re: Monit switches to "not monitored" state occasionally

From: Christopher Johnston
Subject: Re: Monit switches to "not monitored" state occasionally
Date: Fri, 13 Jan 2012 09:11:33 -0500


I actually see this happen a lot as well on my systems, where we restart a large number of apps during a daily code drop (sometimes hundreds of systems x 6 apps per box). Some apps go to an unmonitored state even though the application is still up and running and the pid file has a matching pid. The only way I have been able to resolve this is to restart monit altogether and manually monitor the app again. It causes a lot of grief for my ops guys.

Here is another error string I saw the other night, where the PPID magically changed from 507 to 0. The only way to resolve it has been to fully restart monit with the same procedure as above.

I am using monit version 5.2.5.

<27> Jan 11 17:55:15.547617 -05:00 prod005 monit[5484]: 'WEB01' process PPID changed from 507 to 0


On Fri, Jan 13, 2012 at 9:01 AM, Martin Pala <address@hidden> wrote:

On Jan 13, 2012, at 2:45 PM, Johannes Bauer wrote:

> Hi Martin,
> On 13.01.2012 14:16, Martin Pala wrote:
>> you should check the monit logs - it will show why the service monitoring was disabled (whether it was some manual action, etc.).
> Well, monit is configured to log to syslog:
> set logfile syslog facility log_daemon
> And I can see that there are messages when monit starts, that the
> control file syntax is okay, but that's it. There's no indication
> whatsoever why the processes are in the unmonitored state -- this is
> actually why I'm asking: because the logs do not show anything out of
> the ordinary yet monit put all processes in the "unmonitored" state.
> Is there any automatic action which would cause monit to put a monitored
> child into "unmonitored" autonomically? If so, how can this mechanism be
> disabled?

There are two ways a service can get unmonitored automatically:

1.) when the "if <x> restarts within <y> cycles then timeout" statement is used, Monit will unmonitor the service if this condition matches

2.) when you use a dependency ("depends on <service>") and the parent service is stopped/unmonitored (either via the timeout statement or manually by an admin), the stop/unmonitor action cascades to the child services too.
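
As a rough sketch, both cases could look like this in a control file. The service names (db01, web01), pidfile paths, init scripts, and the restart/cycle counts are made up for illustration and are not taken from anyone's actual configuration:

```
# Hypothetical parent service. If it restarts 3 times within 5 cycles,
# the timeout action unmonitors it (case 1).
check process db01 with pidfile /var/run/db01.pid
    start program = "/etc/init.d/db01 start"
    stop program  = "/etc/init.d/db01 stop"
    if 3 restarts within 5 cycles then timeout

# Hypothetical child service. Because of "depends on", a stop/unmonitor
# of db01 cascades to web01 as well (case 2).
check process web01 with pidfile /var/run/web01.pid
    start program = "/etc/init.d/web01 start"
    stop program  = "/etc/init.d/web01 stop"
    depends on db01
```

If the timeout behaviour is not wanted, leaving out the "then timeout" line should avoid case 1, assuming nothing else in the control file unmonitors the service.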

Also, Monit <= 5.2.5 *temporarily* displayed "Not monitored" while a service restart was pending; the monitoring state returned to "Monitored" when the restart finished. This was confusing, so it was fixed in Monit 5.3, which displays "Monitored" during the restart too.

If none of the above cases matches your configuration, the most probable cause is that somebody manually unmonitored/stopped the service via Monit.

