[Savannah-hackers] Re: cron problem on fencepost


From: Joel N. Weber II
Subject: [Savannah-hackers] Re: cron problem on fencepost
Date: Thu, 18 Oct 2001 13:54:16 -0400

   I had a report on savannah about a list that had not yet been created in
   two days. I've seen that /usr/local/bin/mailing_lists_create.pl had not
   been executed these last two days. It should have been launched by
   /com/sys/cron/hourly. Therefore I suspect there is a cron problem since
   launching this file by hand reported no problems...

Yep.  The ssh process that pushes the aliases file to delysid had hung
two days ago; I just killed it off, but I haven't fixed the real
problem, so it will recur sooner or later.

The failure mode was that fencepost reported an ESTABLISHED
connection in netstat -an output, yet delysid had no such
connection.  Apparently, -o KeepAlive=yes doesn't improve the
situation either, which puzzles me a bit.

In /proc/sys/net/ipv4, fencepost has tcp_keepalive_time 7200 and
tcp_keepalive_probes 9; I was guessing that this meant the ssh
process would be able to die in 18 hours, but apparently that was not
the case.
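
As a rough check on the arithmetic above, here is a small sketch of how
Linux TCP keepalive timing is usually computed: the connection must be
idle for tcp_keepalive_time seconds before the first probe, and probes
are then spaced tcp_keepalive_intvl seconds apart.  The interval is not
quoted in this message; 75 seconds is the usual Linux default and is
only an assumption about fencepost.  The function name is made up for
illustration.

```python
def keepalive_timeout_seconds(idle_time, probes, probe_interval):
    """Seconds until a dead peer is detected on an idle connection.

    idle_time      -- tcp_keepalive_time (idle seconds before the first probe)
    probes         -- tcp_keepalive_probes (unanswered probes before giving up)
    probe_interval -- tcp_keepalive_intvl (seconds between probes)
    """
    return idle_time + probes * probe_interval


# Values quoted in the message, plus an ASSUMED interval of 75 seconds.
expected = keepalive_timeout_seconds(7200, 9, 75)

# The 18-hour guess corresponds to multiplying the two quoted values.
guessed = 7200 * 9

print(expected, "seconds with the assumed interval")
print(guessed // 3600, "hours under the multiplied guess")
```

Under that assumption the expected detection time would be a little over
two hours rather than 18, which still does not explain a process that
hung for two days.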

This started becoming a problem in the last few weeks because I added
locking so that if one hourly run starts when another is still
running, it will just complain rather than actually running.  It had
previously been a problem in that ssh client processes would pile up
on fencepost.
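
For illustration only, the kind of locking described above can be
sketched with a non-blocking flock.  This is not the actual code in
/com/sys/cron/hourly; the function name and lock path are hypothetical.

```python
import fcntl
import os


def try_lock(path):
    """Try to take an exclusive lock on path without blocking.

    Returns an open file descriptor holding the lock, or None if
    another run already holds it.
    """
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        # Another hourly run still holds the lock: give up.
        os.close(fd)
        return None
    return fd


if __name__ == "__main__":
    lock_fd = try_lock("/tmp/hourly.lock")  # hypothetical lock path
    if lock_fd is None:
        print("previous hourly run still in progress; not starting")
    else:
        print("lock acquired; the hourly jobs would run here")
        os.close(lock_fd)  # closing the descriptor releases the lock
```

With a lock like this, one hung ssh keeps the lock held, so every later
run only complains.  That matches the symptom: hung processes no longer
pile up, but nothing else runs either.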



