Hello:
I have a number of high-CPU processes running on 24-core boxes, each configured along these lines:
check process emr-enc01-01 with pidfile /var/run/tada_liveenc_emr-enc01-01.pid
  start program = "/usr/local/tada/launch.sh -c emr-enc01-01"
  stop program = "/bin/bash -c 'kill -s SIGTERM `/bin/cat /var/run/tada_liveenc_emr-enc01-01.pid`'"
  if totalmem > 80% then alert
  if totalmem > 90% then restart
  if totalcpu < 10% for 10 cycles then alert
These processes create pidfiles, and the PIDs in them match what top shows:
  PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 1710 root  20   0 3064m 1.2g 7808 S  578 15.8 47:31.53 tada_liveenc
 1866 root  20   0 2954m 1.3g 7804 S  545 16.7 45:18.52 tada_liveenc
However, monit reports a completely different total CPU usage for the same processes:
Process 'emr-enc01-01'
  status                    Running
  monitoring status         Monitored
  pid                       1710
  parent pid                1
  uptime                    8m
  children                  0
  memory kilobytes          1372300
  memory kilobytes total    1372300
  memory percent            16.7%
  memory percent total      16.7%
  cpu percent               4.1%
  cpu percent total         4.1%
  data collected            Thu, 05 Jan 2012 00:05:49
Process 'emr-enc01-02'
  status                    Running
  monitoring status         Monitored
  pid                       1866
  parent pid                1
  uptime                    8m
  children                  0
  memory kilobytes          1362240
  memory kilobytes total    1362240
  memory percent            16.6%
  memory percent total      16.6%
  cpu percent               4.1%
  cpu percent total         4.1%
  data collected            Thu, 05 Jan 2012 00:05:49
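For comparison, here is a quick sanity check on the two sets of numbers. It is only a sketch under two assumptions that I have not confirmed: that top is in its per-core mode (100% means one fully busy core), and that monit expresses CPU as a share of the whole 24-core machine. The normalize helper is just an illustrative name.

```python
# Back-of-the-envelope comparison of the two CPU figures.
# Assumptions (not confirmed): top reports per-core percent (100% = one core),
# monit reports percent of the total machine capacity.
CORES = 24


def normalize(top_percent: float, cores: int = CORES) -> float:
    """Convert a per-core percentage into a share of the whole machine."""
    return top_percent / cores


# PIDs and %CPU values taken from the top output above.
for pid, top_cpu in [(1710, 578.0), (1866, 545.0)]:
    share = normalize(top_cpu)
    print(f"pid {pid}: top {top_cpu:.0f}% -> {share:.1f}% of {CORES} cores")

# Share of the machine represented by exactly one fully busy core.
one_core_share = 100.0 / CORES
print(f"one core = {one_core_share:.2f}% of the machine")
```

Under those assumptions I would expect monit to show roughly 24% and 23%, not 4.1%. The 4.1% it does show is very close to the share of a single core on this box (100 / 24 is about 4.17), which may or may not be a coincidence.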
Any thoughts on why this might be happening? The hosts run Ubuntu Natty (11.04). Each master process spawns about 150 threads (not forks).
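In case it helps anyone reproduce this independently of both tools, below is a minimal sketch of sampling a process's CPU time straight from /proc. It assumes Linux, where fields 14 and 15 of /proc/<pid>/stat (utime and stime, in clock ticks) are accumulated over all threads of the process. The function names are my own, and I am not claiming this is how monit or top compute their figures.

```python
import os
import sys
import time


def parse_cpu_ticks(stat_line: str) -> int:
    """Return utime + stime (in clock ticks) from a /proc/<pid>/stat line."""
    # The command name (field 2) is wrapped in parentheses and may contain
    # spaces, so only split the part after the last ')'.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state), so field 14 (utime) is rest[11]
    # and field 15 (stime) is rest[12].
    return int(rest[11]) + int(rest[12])


def cpu_percent(pid: int, interval: float = 1.0) -> float:
    """Per-core CPU percent (100 = one busy core) over a sampling interval."""
    ticks_per_sec = os.sysconf("SC_CLK_TCK")
    path = f"/proc/{pid}/stat"
    with open(path) as f:
        before = parse_cpu_ticks(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_cpu_ticks(f.read())
    return 100.0 * (after - before) / ticks_per_sec / interval


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python cpu_sample.py <pid>
    print(f"{cpu_percent(int(sys.argv[1])):.1f}%")
```

Running this against one of the PIDs above and dividing the result by 24 would give a third figure to compare with top and monit.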
FYI:
$ uname -m
x86_64
$ file `which monit`
/usr/local/bin/monit: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.0, not stripped
$ monit -V
This is Monit version 5.3.2
Copyright (C) 2000-2011 Tildeslash Ltd. All Rights Reserved.
Thanks in advance,
-Tom