Re: [monit] Re: monit race conditions on Mac OS X 10.5 Leopard?

monit-general

From:	Martin Pala
Subject:	Re: [monit] Re: monit race conditions on Mac OS X 10.5 Leopard?
Date:	Thu, 24 Jan 2008 21:17:58 -0800

This seems strange. Monit alerts are generated on each action according to the configuration.

Can you run monit in verbose mode (-v option) and send the log?

Is it possible that the mail server rejected the messages, or that you have set an alert filter in the monit configuration to suppress particular alerts?
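
For illustration, an alert filter of the kind meant here might look like the sketch below. The address and the event list are made-up placeholders, not taken from the configuration under discussion. The point is only that an event list in braces after the address restricts which events that recipient is mailed about, so it is worth checking the control file for anything similar.

--8<--
 # hypothetical address and event list, for illustration only
 set alert sysadmin@example.org { timeout, connection }  # "nonexist" is not listed, so no mail for it
--8<--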

By default, monit drops the email notification on a mail server error. There is also support for an event queue, which allows monit to retry the message delivery on the next cycle. To enable it, use:


--8<--
 set eventqueue
     basedir /var/monit  # set the base directory where events will be stored
     slots 100           # optionally limit the queue size
--8<--

Anyway, the verbose mode will reveal what happens with the alert messages and whether the event queue is needed because of mail server problems.



Thanks,
Martin



On Jan 19, 2008, at 12:06 PM, Sergio Trejo wrote:

This is an update to my previous message posted here. Version 4.10.1 of monit most definitely has a bug in it, and it's not related to Mac OS X 10.5, because version 4.9 of monit runs perfectly on Mac OS X 10.5.

The bug is that monit 4.10.1 does not send out multiple email messages when, every cycle, it encounters multiple daemons not running (whether the daemons have crashed or have been torn down intentionally by a sys admin).

Regards,

Sergio

On 1/19/08, Sergio Trejo <address@hidden> wrote:

Hello,

I have monit (version 4.10.1) running on an Apple machine which is Mac OS X Server (Leopard, 10.5.1). My installation of monit monitors six separate daemons for these programs: Apache, Postfix, PostgreSQL, Tomcat, OpenLDAP, and MySQL. My monit configuration file has entries that look like this for all six of the aforementioned programs (taking Apache as an example):

check process apache with pidfile "/opt/local/apache2/logs/httpd.pid" every 10 cycles
    start = "/opt/local/apache2/bin/apachectl start"
    stop = "/opt/local/apache2/bin/apachectl stop"
    if failed port 80 and protocol http then restart
    if 5 restarts within 5 cycles then timeout

Where my daemon frequency is set to 60 seconds as in:

set daemon 60

What is interesting is that I had all six of my daemons running as a starting point, and monit confirmed this (using the little http server built into monit on port 2812). I then, very intentionally (as a sort of auditing process), killed five of my six daemons. The only daemon I left running was the Postfix daemon, because I still wanted monit to be capable of sending email alerts, since I use the internal mail server running on the same machine as Postfix, as in "set mailserver 127.0.0.1". So, with five of the six daemons intentionally killed, monit did later catch up and successfully restarted all five daemons. However, monit only generated two mail message alerts:

1. A message stating that the apache daemon did not exist
2. A message stating that the postgres daemon did exist (seemed to have sent this message after restarting PostgreSQL)

But why didn't I receive ten messages: five of them, one for each daemon that I intentionally killed, stating that they did not exist, and then later on five more messages stating that the five daemons (after being restarted) did indeed exist again?

Also, why did I get the first message for apache saying it didn't exist, whereas the second message, which should also have stated that the apache daemon existed again, instead told me that the postgres daemon existed?

It doesn't make sense. Is it possible that monit was "overwhelmed" or overloaded in some way and became "confused"? I know that doesn't sound appropriate for a binary system, but there is nothing in the monit log file to give me any hints. Perhaps monit experienced a race condition?

The log file shows that all five daemons which I had manually killed were restarted successfully (and indeed they were: I ssh'ed into my server and saw them all running again as processes, and monit also reported them running again on its http server on port 2812).

If this was a race condition, could there be an issue with threading? Mac OS X 10.5 (Leopard and Leopard Server) might be different enough from previous versions of Mac OS X with regard to how threading works (but I am writing this very vaguely, without much information at the moment other than some fuzzy recollection that something related to threading on Leopard might have changed).

Thanks for any suggestions,

Serg

Current Thread

[monit] monit race conditions on Mac OS X 10.5 Leopard?, Sergio Trejo, 2008/01/19
- [monit] Re: monit race conditions on Mac OS X 10.5 Leopard?, Sergio Trejo, 2008/01/19
  - Re: [monit] Re: monit race conditions on Mac OS X 10.5 Leopard?, Jan-Henrik Haukeland, 2008/01/19
    - Re: [monit] Re: monit race conditions on Mac OS X 10.5 Leopard?, Sergio Trejo, 2008/01/19
  - Re: [monit] Re: monit race conditions on Mac OS X 10.5 Leopard?, Martin Pala <=
    - Re: [monit] Re: monit race conditions on Mac OS X 10.5 Leopard?, Sergio Trejo, 2008/01/27
