Agents Constantly Reporting Start Events

From: Christopher Opena
Subject: Agents Constantly Reporting Start Events
Date: Sun, 8 May 2011 14:04:38 -0700


Last night we upgraded all of our Monit agents to 5.2.5 and our M/Monit server to 2.3.4.  For the first few hours everything appeared to be working fine, if a little on the slow side.  After that, however, we started getting flooded with "Monit Started" event messages from every server.  Checking the logs on a sample server, I see:

[UTC May  8 20:51:11] error    : M/Monit: communication failed (event message)
[UTC May  8 20:51:11] error    : M/Monit handler failed, retry scheduled for next cycle
[UTC May  8 20:52:17] error    : M/Monit: error receiving data from http://{our_monit_url}:{our_monit_port}/collector -- 

This repeats over and over, roughly every 30 seconds to 1 minute.  At first I worried that our M/Monit server might be overloaded, but on the M/Monit server itself the load average is only about 1.2, memory usage is below 50%, and CPU usage is about 0.2% (user and sys combined).
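
In case anyone wants to check the interval on their own agents, here is a rough Python sketch of how the gaps between these errors could be measured.  It assumes the log format shown above and an English locale for the month name.  The function names and the default year are my own placeholders, not anything Monit provides.

```python
import re
from datetime import datetime

# Matches Monit log lines such as:
# [UTC May  8 20:51:11] error    : M/Monit: communication failed (event message)
LINE_RE = re.compile(
    r"^\[\S+\s+(?P<stamp>\w{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\]"
    r"\s+error\s*:\s+M/Monit"
)


def mmonit_error_times(lines, year=2011):
    """Return the timestamps of all M/Monit error lines.

    The log lines carry no year, so one is supplied by the caller
    (assumption: all lines fall within the same year).
    """
    times = []
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            # Collapse the padding in "May  8" before parsing.
            stamp = " ".join(match.group("stamp").split())
            times.append(
                datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
            )
    return times


def gaps_in_seconds(times):
    """Return the number of seconds between consecutive timestamps."""
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]
```

Feeding it the lines of a Monit log file (for example `gaps_in_seconds(mmonit_error_times(open(path)))`, where `path` is wherever your agent writes its log) gives a list of intervals that can be compared against the agent's poll cycle.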

Looking at the M/Monit logs, I only see a few of these:

2011-05-08 20:59:37 [client {client_ip}] HTTP 408 Request Timeout

The clients in those entries are all browsers accessing the M/Monit server; none of these messages come from Monit agents.

Anyone have any ideas what I can look for here?

Thanks in advance,

