Re: Monit believes process failed when it didn't

From: Eric Montellese
Subject: Re: Monit believes process failed when it didn't
Date: Thu, 3 Dec 2020 16:46:25 -0500

Got an odd one for ya...

I have a (legacy) shell script that I need to call from monit.  This shell script runs an infinite loop.  The platform is a busybox-based openwrt platform (so, the script is running 'ash').  

On this platform, the timing of background processes does not seem to be quite what I expected, and I'd like to understand the intended approach.  The method from the monit docs would seem to be a foolproof way to avoid the issue I'm seeing (attempts 2 and 3 below).  However, that method fails outright (attempt 1 below).

I'm currently running Monit version 5.26.0.

The monit config is pretty simple:

check process myprocess with pidfile /tmp/
     start program = "/etc/monit.rc/myprocess.init start"
     stop program  = "/etc/monit.rc/myprocess.init stop"
     depends on other_process

myprocess.init is also quite simple (just showing the 'start' method).  Here are three different things I've tried:

1.  Following the example in the monit docs:
start() {
    echo $$ > /tmp/
    exec /usr/bin/
}

In this case, monit says that the "process never returned" and tries to restart it.  Of course the process didn't return, so why is this the documented method?  Is this a difference in versions of monit (vs the documentation I'm using)?

2. Jam that sucker into the background
start() {
    /usr/bin/ &
    echo $! > /tmp/
}

Surprisingly, this also does not work.  In this case, the pid file is created as expected, but monit does *not* think that the process is running.

3. Try something silly?
start() {
    /usr/bin/ &
    echo $! > /tmp/
    sleep 1
}

Adding a 'sleep' fixes the issue... but why?
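
One variation on attempt 3, as a minimal sketch rather than a known fix: replace the fixed 'sleep 1' with a bounded check that the recorded PID refers to a live process before the script returns.  The DAEMON and PIDFILE values and the start_daemon helper are hypothetical placeholders, since the real paths are not shown above.  It is an assumption that this check has the same effect as the sleep; all it guarantees is that the script's exit status reflects whether the PID exists.

```shell
#!/bin/sh
# Hypothetical placeholders; the real paths are elided in the post above.
DAEMON="${DAEMON:-/usr/bin/my_daemon}"
PIDFILE="${PIDFILE:-/tmp/my_daemon.pid}"

# Start "$@" in the background, record its PID, then confirm the PID exists.
start_daemon() {
    # Detach stdio so the child does not hold the caller's descriptors open.
    "$@" </dev/null >/dev/null 2>&1 &
    pid=$!
    echo "$pid" > "$PIDFILE"

    # Bounded check: succeed as soon as the PID exists, give up after ~3 s.
    tries=0
    while [ "$tries" -lt 3 ]; do
        kill -0 "$pid" 2>/dev/null && return 0
        tries=$((tries + 1))
        sleep 1
    done
    return 1
}

start() {
    start_daemon "$DAEMON"
}
```

Only whole-second sleeps are used here, because fractional 'sleep' arguments may not be available in every busybox build.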

For debugging, instead of the 'sleep' I've also tried putting
'ps | grep myprocess > /tmp/output'

In this case, I *do* see the process listed in the /tmp/output file, but monit also returns happily.  (So it's a heisenbug: observing it makes it go away.)

1.  What is the "normal" way to do this?
2.  Anyone seen this sort of behavior on an embedded system?

Best Regards,
