From: Martin Pala
Subject: Re: problem with cpu usage (user)
Date: Mon, 05 Feb 2007 23:18:40 +0100
I was not able to replicate the problem. It seems that there is something special in your environment.

Can you please describe your environment:

- what HW do you use?

- what Linux version?

- what kernel version?

- what monit version? Is it the original monit release or a package from some Linux distribution?

- where was monit compiled? (on the same machine where you see the problem, or on a different system?)

- your monit configuration

It could help if you can test the following:

1.) try to compile monit 4.8.2 on your server

2.) send the output of './configure' and 'make' from the monit compilation

3.) start monit in one terminal using 'date && ./monit -vIc /path/to/the/monitrc'

4.) run 'date && vmstat 5' in another terminal

5.) run 'date && while true; do cat /proc/stat; sleep 5; done' in another terminal

6.) run 'date && dd if=/dev/zero of=/dev/null' to generate some load (skip this step if there is natural load in your environment)

7.) when the problem starts, wait a few minutes, then terminate monit, vmstat, the /proc/stat loop and dd with ^C, and send the output from monit, vmstat and /proc/stat
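
To compare the /proc/stat samples from step 5 with the figures monit reports, the percentages between two samples can be worked out by hand. Below is a minimal Python sketch of that arithmetic. It is not monit's actual code, and the names (FIELDS, parse_cpu_line, cpu_percentages) are made up for illustration. It assumes the usual field order of the aggregate "cpu" line (user, nice, system, idle, iowait, irq, softirq, plus steal on newer kernels); the sample counter values are invented.

```python
# Illustrative sketch only: this is NOT monit's implementation.
# All names here are hypothetical and chosen for this example.

# Assumed field order of the aggregate "cpu" line in /proc/stat.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Return the jiffy counters from an aggregate 'cpu' line."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line from /proc/stat")
    # Keep at most the first 8 counters. Older kernels print fewer; newer
    # ones append guest counters that are already included in user/nice.
    return [int(value) for value in parts[1:1 + len(FIELDS)]]


def cpu_percentages(before, after):
    """Return the share of time per state between two 'cpu' samples."""
    old = parse_cpu_line(before)
    new = parse_cpu_line(after)
    deltas = [n - o for o, n in zip(old, new)]
    total = sum(deltas)
    if total <= 0:
        raise ValueError("samples are identical or in the wrong order")
    return {name: 100.0 * delta / total for name, delta in zip(FIELDS, deltas)}


# Two made-up samples, as they might appear 5 seconds apart.
before = "cpu  1000 0 500 8000 500 0 0"
after = "cpu  1300 0 600 8500 600 0 0"
result = cpu_percentages(before, after)
print(result["user"])  # prints 30.0
```

Feeding it two consecutive "cpu" lines from the loop in step 5 gives numbers that should roughly match vmstat and top for the same interval. If the user share stays low there while monit reports a high value, that would suggest the problem is in how monit reads or computes the counters rather than in the counters themselves.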



Aleksander wrote:
Matt Corks wrote:
Greetings, all.  I'm having a problem with monit 4.8.2 on gentoo 1.4.16
(Linux kernel 2.6.10-gentoo-r6).  According to top all CPUs are mostly
idle, but monit thinks the cpu user usage is hovering over 70%.  Having
mpstat average CPU usage over the same cycle length as monit (2
minutes) results in the same values as top.


I'm still having this issue too. It popped up again over the weekend. On Saturday night I started getting SMS messages about issues with cpu user and cpu wait. As I was away from civilization, I was only able to disable system monitoring at half past eight yesterday. That was 134 messages, and I'm quite certain they were false alarms.

Now I have re-enabled monit system monitoring and the problem is still there. I look at top and there's nothing, while monit reports 91.x% cpu user usage. So I set monit to alert only on "3 times per 3 cycles"; same problem. Looking at top, the machine is mostly idle. cpu user only occasionally goes over 30, and that's with both CPUs combined.

There's definitely a bug. But it appeared only the night before on that box, which had a monit uptime of something like 60 or 80 days. I reloaded monit and still have this issue (the uptime was reset).

Did Matt's /proc/stat output help resolve the issue?


