Re: [Sks-devel] Introduction + some ideas ...


From: Phil Pennock
Subject: Re: [Sks-devel] Introduction + some ideas ...
Date: Thu, 7 Oct 2010 06:53:09 -0400

On 2010-10-03 at 16:02 +0200, Sebastian Urbach wrote:
> Problem 2:
> 
> Pool checks are, as I understand it, only performed about once every
> 12 hours? If someone has a short problem, let's say a reboot or
> something during the check, he will be excluded from the pool for at
> least 12 hours, as happened with our keyserver a few days ago.

Is this actually a problem?  As long as there are enough hosts in the
pool to distribute the load, why does it matter whether one particular
host is listed?  This isn't a farm where we have N+2 capacity; there
are plenty of servers.

> "The key differ value" is just based on your own number +/- afew
> hundred as i recall. That seems to be a problem because the servers
> have very different statistic times in reality and that could be result
> in a difference that is too large by the time your value kicks in.
> 
> Solution:
> 
> Just sum up all key numbers from the servers which are in the pool at
> that time and divide by the number of keyservers. It could be
> posted on the status page so that everybody can read it. Maybe like:
> 
> Total key number (all servers) / number of servers = xyz
> 
> You should be in the range between xyz and xyz keys to be included in
> the pool.

The problem with this approach is that some servers report 0 (or -1?)
keys until they scan, so they'd drag down the average enough to include
stale servers.  A low-pass filter can strip out those servers first.

Once that's done, assume that all servers are up-to-date, within 100
keys of each other.  The average will be somewhere in the middle, and
filtering on >= average will skip perfectly good servers.
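
To make that concrete, here's a quick Python sketch with made-up key
counts (nothing taken from the real pool):

  # Hypothetical counts: two servers that haven't scanned yet, plus
  # five up-to-date servers within ~100 keys of each other.
  counts = [0, 0, 2843100, 2843130, 2843150, 2843170, 2843200]

  naive_avg = sum(counts) / len(counts)
  print("naive average: %d" % naive_avg)   # ~2030821: stale servers pass

  nonzero = [c for c in counts if c > 0]   # crude low-pass filter
  avg = sum(nonzero) / len(nonzero)
  print("filtered average: %d" % avg)      # 2843150: middle of the pack

  kept = [c for c in nonzero if c >= avg]
  print("kept by '>= average':", kept)
  # Two up-to-date servers, only 20 and 50 keys behind, are excluded.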

Actually, it's rather harder than it first appears to come up with an
algorithm which includes all servers which are "not broken".  The best
I've come up with is to bucket the servers (currently buckets of 3000),
take the mode bucket, and discard all servers more than 5 stddev away.
Then find the standard deviation of the remaining servers, take the
second-most-populated bucket, subtract that stddev and a constant
kDAILY_KEYJITTER, and use all servers larger than that.
kDAILY_KEYJITTER is there to cope with a server restart suddenly making
a keyserver report more keys; it's 500, which is about the maximum
number of new entries normally seen in a day.
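
In Python terms the procedure looks roughly like this (a sketch of the
idea, not the code actually running behind the URLs below; exactly
where the stddev cuts are anchored is glossed over):

  from collections import Counter
  from statistics import pstdev

  BUCKET_SIZE = 3000        # current bucket width
  DAILY_KEYJITTER = 500     # kDAILY_KEYJITTER

  def eligible(key_counts):
      # Bucket the counts and find the mode bucket.
      buckets = Counter(c // BUCKET_SIZE for c in key_counts)
      mode_bucket = buckets.most_common(1)[0][0]
      mode_centre = mode_bucket * BUCKET_SIZE + BUCKET_SIZE // 2

      # First cut: drop anything more than 5 stddev from the mode bucket.
      sd_all = pstdev(key_counts)
      survivors = [c for c in key_counts
                   if abs(c - mode_centre) <= 5 * sd_all]

      # Second cut: stddev of the survivors, threshold anchored on the
      # second-most-populated bucket, less that stddev and the jitter.
      sd = pstdev(survivors)
      ranked = Counter(c // BUCKET_SIZE for c in survivors).most_common()
      second = ranked[1][0] if len(ranked) > 1 else ranked[0][0]
      threshold = second * BUCKET_SIZE - sd - DAILY_KEYJITTER

      return [c for c in survivors if c > threshold]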

There's a lot of juju that went into finding that bucketing, and it's
still not good.  There are clearly better algorithms.  But what
Kristian is using is very simple and rather clean, *provided* the
reference server is up-to-date.

You can see the calculations I mention applied to the current keyserver
population in the STATS: debug comments at the first URL here:
  http://sks.spodhuis.org/sks-peers/ip-valid-stats
  http://sks.spodhuis.org/sks-peers/ip-valid

Note the histogram has *two* peaks, and this does not include servers
reporting 0.  (And the rows are not contiguous.)

-Phil


