[Pan-devel] Re: ancient DB schema

From: Duncan
Subject: [Pan-devel] Re: ancient DB schema
Date: Sat, 05 Jun 2004 08:39:09 -0700
User-agent: Pan/ (As She Crawled Across the Table)

K. Haley posted <address@hidden>, excerpted below,  on Sat,
05 Jun 2004 00:30:52 -0600:

> Unresolved: should we handle multiple hostname+port pairs? (just remember
> to separate servers that handle different groups.)

That's an interesting idea..  My ISP (Cox Cable) has three servers
number-synced between them, and carrying the same groups, altho there are
slight retention differences and there's one per region, so some
distance variation.  Multiple host/port pairs per server would allow me to
put them all under a single server entry.

OTOH, I'm not sure it's worth the trouble, particularly once we get
automated multi-server handling, as it'd work just as well to make them
three separate server entries as I do now (and that'd solve any minor
retention difference problems as well).

Perhaps a better use of such a feature would be "virtual" servers, which
would display multiple servers as one but continue to track them
separately.

In fact, as I think about it, perhaps the most efficient way to do it
would be to support virtual servers almost from the start, by splitting
the structure such that the group contents of physical servers are never
really displayed at all (except perhaps in a diagnostic screen), only
back-end tracked, with the normal front-end display being entirely
separated as a virtual server, and the plumbing between the two, save
for configuration, handled entirely transparently.  

This would allow flexibility in a number of ways.

In the simple case, single physical server, single view of it (virtual
server), it would function much as it does now.

I believe it's fairly common among Usenet junkies, even when there's only
one physical server available, to split it into multiple logical servers,
providing at least one level of hierarchical nesting.  Thus, one may have a
"pix" server, a "movez" server, an "isp" server, a "linux" server, an "ms"
server, etc, all pointing at the same physical server.

With the split virtual vs physical server arrangement, this would be quite
easy to control, as each would simply point to the same physical server. 
The bonus here is that unlike the current setup where each is actually
stored as a fully separate server that just happens to have the same
hostname as the others,  it would remain a single physical server on the
backend, and could limit connections AS a single physical server.  Thus,
if you set up a bunch of "movez" to d/l and then switched to your
"linucks" server to catch up on the discussion on your favorite distrib,
the backend server would know they were both off the same physical server
and take a connection away from the "movez" download to update the
"linucks" server, but ONLY as long as it was necessary to do so, returning
the connection to "movez" d/ling while you composed a reply, for instance.
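
A rough sketch of that shared connection limit, in Python for brevity (this
is not PAN code).  The class name, the acquire/release methods, the
"interactive" flag and the hostname are all made up for illustration; the
only idea taken from above is that every virtual server draws on one limit
owned by the physical server, and that a borrowed connection goes back to
the download when the interactive work is done.

```python
class PhysicalServer:
    """Hypothetical back-end record: one host, one shared connection limit."""

    def __init__(self, host, max_connections):
        self.host = host
        self.max_connections = max_connections
        self.held = {}      # virtual server name -> connections currently held
        self.borrowed = []  # (borrower, lender) pairs still to be returned

    def acquire(self, view, interactive=False):
        """Give `view` one connection.  At the limit, only an interactive
        request may take one away from another virtual server."""
        if sum(self.held.values()) >= self.max_connections:
            lenders = [v for v, n in self.held.items() if v != view and n > 0]
            if not interactive or not lenders:
                return False
            lender = max(lenders, key=lambda v: self.held[v])
            self.held[lender] -= 1
            self.borrowed.append((view, lender))
        self.held[view] = self.held.get(view, 0) + 1
        return True

    def release(self, view):
        """Drop one of `view`'s connections, handing a borrowed one back."""
        if self.held.get(view, 0) == 0:
            return
        self.held[view] -= 1
        for i, (borrower, lender) in enumerate(self.borrowed):
            if borrower == view:
                del self.borrowed[i]
                self.held[lender] += 1
                break


# Two virtual servers, "movez" and "linucks", on the same physical server.
cox = PhysicalServer("news.example.invalid", max_connections=2)
cox.acquire("movez")
cox.acquire("movez")                      # the download uses both connections
cox.acquire("linucks", interactive=True)  # catching up borrows one of them
during = dict(cox.held)
cox.release("linucks")                    # done reading: "movez" gets it back
```

With separate, unrelated server entries there is nothing like "cox.held" to
consult, which is why the limit can't be enforced across them today.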

Likewise, the virtual servers being only views on physical servers, could
collate the view of the same group on multiple physical servers,
prioritizing physical server fetch, but with each physical server back-end
still managing its own connection limits.
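
To make the collation concrete, here is a minimal sketch in Python.  It
assumes (my assumption, not anything PAN does today) that articles are
matched across servers by message-id and that a lower priority number means
"try this server first".  The function name, server names and message-ids
are hypothetical.

```python
def collate(group, servers):
    """Merge one group's articles from several physical servers.

    `servers` is a list of (priority, name, articles) tuples, where
    `articles` maps a group name to the set of message-ids carried.
    Returns message-id -> server names, best priority first.
    """
    sources = {}
    # sorted() is stable, so servers with equal priority keep their order.
    for priority, name, articles in sorted(servers, key=lambda s: s[0]):
        for msg_id in articles.get(group, ()):
            sources.setdefault(msg_id, []).append(name)
    return sources


servers = [
    (2, "west", {"comp.os.linux": {"<a@x>", "<b@x>"}}),
    (1, "east", {"comp.os.linux": {"<b@x>", "<c@x>"}}),
]
view = collate("comp.os.linux", servers)
```

The virtual server would show each article once and fetch it from the first
server in its list, falling back to the next if that one has expired it.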

In this scenario, the physical server backends could be "dumb", not
knowing or caring how many virtual servers were hooked to it.  It would
have to track max connections, etc, but wouldn't even have to track
rank, as the virtual servers would manage that.  This would keep the
physical server records almost of fixed size (assuming one set a
reasonable max size for the various strings, they COULD be fixed length).

Each virtual server, then, would have all the complicated variable length
stuff, as each could have multiple physical server entries, each of which
would have to contain not only the physical server pointer, but a
hash/bitfield of the displayed subscribed groups actually carried by that
server, AND a priority level.
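
One way the two record types might look, sketched in Python.  The 64-byte
hostname field, the field order and every name here are assumptions chosen
for illustration, not a proposed on-disk format.  The physical record packs
to a fixed size; the virtual server carries the variable-length list of
links, each with a pointer (here an index), a group bitfield and a priority.

```python
import struct
from dataclasses import dataclass, field

# Fixed-length physical record: hostname (64 bytes, NUL padded),
# port and max connections (unsigned 16-bit each), little-endian.
PHYSICAL_RECORD = struct.Struct("<64sHH")


def pack_physical(host, port, max_connections):
    return PHYSICAL_RECORD.pack(host.encode("ascii"), port, max_connections)


def unpack_physical(blob):
    host, port, max_connections = PHYSICAL_RECORD.unpack(blob)
    return host.rstrip(b"\0").decode("ascii"), port, max_connections


@dataclass
class PhysicalLink:
    physical_index: int  # which physical server record this points at
    carried: int         # bit i set if subscribed group i is carried there
    priority: int        # lower number = fetch from this server first


@dataclass
class VirtualServer:
    name: str
    groups: list                               # subscribed groups, in bit order
    links: list = field(default_factory=list)  # variable-length part

    def sources_for(self, group):
        """Physical server indexes carrying `group`, best priority first."""
        bit = 1 << self.groups.index(group)
        hits = [link for link in self.links if link.carried & bit]
        return [link.physical_index
                for link in sorted(hits, key=lambda link: link.priority)]


record = pack_physical("news.example.invalid", 119, 4)
linux = VirtualServer("linux", ["comp.os.linux", "alt.os.linux"])
linux.links.append(PhysicalLink(physical_index=0, carried=0b11, priority=2))
linux.links.append(PhysicalLink(physical_index=1, carried=0b01, priority=1))
```

Note that rank lives only in the link, so the physical record never needs
to change when a virtual server is added or reprioritized.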

Most likely the simplest way to keep things modular and complexity down
would be to have a third piece, the middleware between the two servers. 
This would likely be entirely invisible to the user, but would handle the
job of queuing the requests from the virtual server to the physical
server.  Depending on implementation, (1) the virtual server could do the
prioritizing and hand it off to the (dumb) middleware which would queue
from an individual virtual server to an individual physical server (with
the physical server then handling multiple middleware queues to do the
actual downloading as necessary), or (2) the virtual server could do
little more than display, while the (smart) middleware handled the
prioritizing and handing off to the backends, or (3) the (power-ware)
middleware could handle virtually all the prioritizing and queuing,
leaving a dumb fetcher physical server (little more than a wrapper for
gnet, but enforcing the per-server connection limits), and a dumb-display
virtual server.
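
Option (1) is probably the easiest of the three to picture, so here is a
small Python sketch of it, under my own assumptions: one dumb FIFO per
virtual/physical pair, and a physical-side fetcher that takes requests
round-robin from its queues, never more than its connection limit at a
time.  The class and method names are invented, and no real network code is
involved.

```python
from collections import deque


class Middleware:
    """Dumb FIFO between one virtual server and one physical server."""

    def __init__(self, virtual_name):
        self.virtual_name = virtual_name
        self.pending = deque()

    def enqueue(self, article_id):
        # The virtual server has already decided the order; just keep it.
        self.pending.append(article_id)


class PhysicalFetcher:
    """Physical side: drains its middleware queues within the limit."""

    def __init__(self, max_connections):
        self.max_connections = max_connections
        self.queues = []

    def attach(self, queue):
        self.queues.append(queue)

    def next_batch(self):
        """Take up to max_connections requests, round-robin across queues."""
        batch = []
        while len(batch) < self.max_connections:
            progressed = False
            for q in self.queues:
                if q.pending and len(batch) < self.max_connections:
                    batch.append((q.virtual_name, q.pending.popleft()))
                    progressed = True
            if not progressed:
                break  # every queue is empty
        return batch


movez, linucks = Middleware("movez"), Middleware("linucks")
fetcher = PhysicalFetcher(max_connections=3)
fetcher.attach(movez)
fetcher.attach(linucks)
for article in ("m1", "m2", "m3"):
    movez.enqueue(article)
linucks.enqueue("l1")

first = fetcher.next_batch()
second = fetcher.next_batch()
```

Options (2) and (3) would move the ordering decision out of the virtual
server and into the middleware, leaving the fetcher even simpler.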

As I think about it, the display aka virtual server would be a single
thread, with multiple fetcher aka physical server threads possible, each
with multiple connections possible, and multiple middleware threads as
well, whether they do all the work leaving the physical servers as
little more than conduits, or whether the physical servers do most of the
work leaving the middleware as little more than conduits.
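
As a very loose illustration of that thread layout, the Python sketch below
gives each physical server its own worker pool, sized to that server's
connection limit, while the caller stands in for the single display thread.
Using a thread pool per server is my own simplification, and "fake_fetch"
only returns a string where a real fetcher would talk NNTP.

```python
from concurrent.futures import ThreadPoolExecutor


def fake_fetch(server, article_id):
    """Stand-in for a real article download on one connection."""
    return f"{server}:{article_id}"


def fetch_all(limits, requests):
    """Run (server, article_id) requests, at most limits[server] at once
    per physical server.  Results come back in request order."""
    pools = {name: ThreadPoolExecutor(max_workers=n)
             for name, n in limits.items()}
    try:
        futures = [pools[server].submit(fake_fetch, server, article_id)
                   for server, article_id in requests]
        return [f.result() for f in futures]
    finally:
        for pool in pools.values():
            pool.shutdown()


results = fetch_all(
    {"east": 2, "west": 4},
    [("east", "a1"), ("west", "a2"), ("east", "a3")],
)
```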

Of course, something like that will certainly mean a rather more intense
rewrite of PAN as we know it, than would a more conservative approach, but
it's quite likely, IMO, that PAN will never really be able to handle
multi-servers /well/, until such a modularized approach is taken.

(Wow!  When I started, I didn't know I'd end up HERE!  <g>)

Duncan - List replies preferred.   No HTML msgs.
"They that can give up essential liberty to obtain a little
temporary safety, deserve neither liberty nor safety." --
Benjamin Franklin
