Re: Mail posting in newsgroups in Gnus

From: Bob Proulx
Subject: Re: Mail posting in newsgroups in Gnus
Date: Sun, 15 May 2016 16:25:17 -0600
User-agent: Mutt/1.5.24 (2015-08-30)

Eli Zaretskii wrote:
> Lars Magne Ingebrigtsen wrote:
> > The SMTP servers have been timing out on incoming mail for some
> > people the last couple of days, whether you're posting through Gmane or
> > not.  But some seem to make it through, for some reason...
> > 
> > Or perhaps it's just email from Norway?  I wonder whether this one makes
> > it through...
> machines have connectivity problems lately.  I hope they will
> be resolved soon.

On Friday the 6th there was a large move with an assortment of
related and unrelated changes, which has made it difficult to tell
what broke what.  I am deep in the middle of *.savannah and can give
the most information about that.

The * machines moved to a different host server in
the same data center.  But their file systems got repackaged and the
copy process had problems.

1) All of the system uids were mapped to those of the host used to do
   the copy.  All of those had to be changed back to their proper
   uids.  Fixed that.

2) All of the ACLs were lost.  This prevented the web team from being
   able to access files.  Fixed that.

3) Swap was lost, causing OOM activity.  Fixed that.

4) Additionally a very new kernel from the host system was booted
   instead of the system's own.  That caused a subset of the
   networking problems.  The managed switch was reporting a
   continuous high rate of dropped packets.  Reverting to the system
   kernel fixed the switch's reports of dropped packets.

5) This happened late on a Friday without notice, meaning that we had
   to scramble around over the weekend to find and fix the problems.

6) Long term there have been persistent problems with stuck
   connections to the git, cvs, bzr, and hg daemons.  Since git is by
   far the most popular we mostly hear about git problems, but it
   happens with the others too.  The stuck daemons stack up, consuming
   slots until all of the xinetd slots are filled and it hits the
   limit of 100 running processes.  At that point no new connections
   can be made.  This has been made worse by the Friday network move.
   Something is now tickling this problem aggressively.  We are
   manually monitoring this and mitigating it by killing stale
   processes.  We hope to fix this as soon as we can migrate onto the
   new operating system VMs promised to us any day now for the last
   two years.
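
For anyone curious what the manual monitoring of stuck daemons could
look like, here is a minimal sketch in Python.  It is not what we
actually run.  The daemon process names and the one hour threshold are
assumptions made up for illustration, and it assumes a procps-ng ps
that supports the etimes (elapsed seconds) column.  Only the limit of
100 comes from the description above.

```python
import subprocess

# Assumed values for illustration only, not Savannah's real settings.
DAEMON_NAMES = {"git-daemon", "cvs", "bzr", "hg"}
STALE_AFTER_SECONDS = 3600
SLOT_LIMIT = 100  # the xinetd limit described in the text


def snapshot():
    """Return raw ps output: pid, elapsed seconds, command name, no header."""
    result = subprocess.run(
        ["ps", "-e", "-o", "pid=", "-o", "etimes=", "-o", "comm="],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def parse_ps(text):
    """Parse ps output lines into (pid, age_seconds, name) tuples."""
    rows = []
    for line in text.splitlines():
        parts = line.split(None, 2)
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        pid, age, name = parts
        rows.append((int(pid), int(age), name.strip()))
    return rows


def slots_used(rows, names=DAEMON_NAMES):
    """Count how many watched daemon processes are currently running."""
    return sum(1 for _, _, name in rows if name in names)


def find_stale(rows, names=DAEMON_NAMES, max_age=STALE_AFTER_SECONDS):
    """Return watched daemon rows that have been running longer than max_age."""
    return [row for row in rows if row[2] in names and row[1] > max_age]
```

Calling parse_ps(snapshot()) and comparing slots_used() against
SLOT_LIMIT shows how close the daemons are to the limit.  The sketch
only lists candidates from find_stale().  Whether to kill them is left
to a human, since a long running clone of a large repository can look
just like a stuck one.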

On the same day the entire Boston FSF network was changed to a
different network routing of which I know little.  I don't have much
visibility into this change.  traceroute shows a different route now.
This changed for every system.  People are reporting many problems
across all of the systems.  There are reports of differences seen
between IPv4 and IPv6.

Except for the stuck process problem I think the Savannah systems are
running within nominal limits.  It is "normal".  For Savannah anyway.
(Not to say things don't need fixing.  But we are waiting in the
pipeline for a new VM that we can migrate onto.)  However the
random network problems people are reporting involve fencepost, eggs
(mail relay), (mailing list),, and many
others.  Those don't have anything in common except for the networking
and all are suffering.

If you are suffering problems I encourage you to file a trouble
report.  Please include details.  Say where from and where to.  Say
whether it was IPv4 or IPv6.  Say what time the problem occurred.
Then if it works later, update the report and say that.  One of the
problems is that this seems worse from Europe than from the US.  Here
in Colorado I have a hard time triggering any problems, though I have
been able to on occasion.  Most of the problem reports have come from
people in Europe, and most of them have been using IPv6.
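
To help collect those details, here is a rough sketch in Python that
tries one TCP connection over IPv4 and one over IPv6 and formats a
line with the time, source, destination, address family, and result.
The function names, the report layout, and the ten second timeout are
my own assumptions for illustration.  The host and port are whatever
you pass in.

```python
import socket
import time
from datetime import datetime, timezone

FAMILIES = {"IPv4": socket.AF_INET, "IPv6": socket.AF_INET6}


def probe(host, port, family, timeout=10.0):
    """Try one TCP connection; return (ok, elapsed_seconds, detail)."""
    start = time.monotonic()
    try:
        infos = socket.getaddrinfo(host, port, family, socket.SOCK_STREAM)
        addr = infos[0][4]  # first resolved address for this family
        with socket.socket(family, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            sock.connect(addr)
        return True, time.monotonic() - start, addr[0]
    except OSError as exc:  # covers resolution failures and timeouts
        return False, time.monotonic() - start, str(exc)


def format_report(source, host, port, label, ok, seconds, detail, when=None):
    """Build one report line; 'source' is a free-form note on where you are."""
    when = when or datetime.now(timezone.utc)
    status = "ok" if ok else "FAILED"
    return (f"{when:%Y-%m-%d %H:%M:%S} UTC  from={source}  to={host}:{port}  "
            f"{label}  {status}  {seconds:.1f}s  {detail}")


def check(source, host, port):
    """Probe over both address families and return the report lines."""
    lines = []
    for label, family in FAMILIES.items():
        ok, seconds, detail = probe(host, port, family)
        lines.append(format_report(source, host, port, label, ok, seconds, detail))
    return lines
```

Running check() with your location, the affected host, and the port
you were using (9418 for the git protocol, for example) gives two
lines that can be pasted into a report.  Running it again later shows
whether the problem went away.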

And that is all I know.

