Re: cfagent hangs
From: Luke A. Kanies
Subject: Re: cfagent hangs
Date: Thu, 4 Dec 2003 12:26:47 -0600 (CST)
On Mon, 24 Nov 2003, Jeff Wasilko wrote:
> Hi:
>
> I've been having problems with cfagent hanging for multiple days.
> It's usually started by some sort of network problem (we've had a
> bit of instability here that we've traced down to a failing gigE
> switch).
>
> cfagent is started by cfexecd. Is there any way to get cfexecd to
> kill the wedged cfagent?
>
> lexx 7 ># ps -ef | grep cfagent
> root 17435 375 0 Nov 22 ? 0:04
> /is/local/state/cfengine/bin/cfagent
>
> lexx 8 ># truss -p 17435
> recv(8, 0xFFBF2618, 8, 0) (sleeping...)
>
> It seems to be hung in a copy of a big tree (pushing out our
> /usr/local equivalent):
>
> This is the mail I got from cfengine when I killed the hung
> cfagent:
>
> cfengine:lexx: Received signal 15 (SIGTERM) while doing
> [lock.cfagent_conf.lexx.copy._is_dist_pkg__is_dist_pkg]
> cfengine:lexx: Logical start time Sat Nov 22 16:20:34 2003
> cfengine:lexx: This sub-task started really at Sat Nov 22 16:20:34 2003
[obviously, I'm catching up on email]
I had a similar problem. It was somehow related to a bad compile of
cfengine against BerkeleyDB; I don't know exactly what went wrong, but
eventually cfagent would hang forever trying to take locks in the lock_db
file. And I mean forever; I'm talking fork bomb.
It would be nice if cfexecd were configurable to kill child processes
after a certain amount of time; I would settle for a hard-coded value, but
a configurable one would be best. I think an hour is reasonable, but four
might be better for the general case.
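
To make the idea concrete, here is a rough sketch in Python of the kind of
timeout I have in mind. This is not anything cfexecd does today; the
function name run_with_timeout and the one-hour figure in the comment are
my own assumptions, and a real fix would belong in cfexecd itself.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_seconds):
    """Run cmd and wait for it; kill it if it outlives timeout_seconds.

    Returns the child's exit code, or None if it had to be killed.
    (Hypothetical helper for illustration; not part of cfengine.)
    """
    try:
        # subprocess.run kills the child and reaps it before raising
        # TimeoutExpired, so no wedged process is left behind.
        completed = subprocess.run(cmd, timeout=timeout_seconds)
        return completed.returncode
    except subprocess.TimeoutExpired:
        return None


if __name__ == "__main__":
    # Assumed usage: give cfagent an hour (3600 s), e.g.
    #   run_with_timeout(["/is/local/state/cfengine/bin/cfagent"], 3600)
    # Here a sleeping child stands in for a wedged cfagent.
    result = run_with_timeout(
        [sys.executable, "-c", "import time; time.sleep(30)"], 1
    )
    print("killed" if result is None else "exit code %d" % result)
```

Note that this only kills the direct child; anything that child has forked
would need to be handled separately, for instance by putting it in its own
process group.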
This was also with cfengine 2.0.8p1, but as I said, it was a bad compile.
We recompiled against BerkeleyDB 4.0.14 or thereabouts and it worked fine.
This only happened on AIX. I also had to go back and delete every db file
on every machine with this problem, as they all appeared to be
irretrievably corrupt.
--
"But these [serious NT security flaws] are not inherent flaws in the
operating system -- they don't happen by accident. They are the result
of deliberate and well-thought-out efforts." --Mike Nash, Microsoft.
The _flaws_ are deliberate?