Re: [Bug-gnupedia] Architecture Questions


From: Bryce Harrington
Subject: Re: [Bug-gnupedia] Architecture Questions
Date: Sun, 21 Jan 2001 01:36:16 -0800 (PST)

On 21 Jan 2001, Mike Warren wrote:
> Bryce Harrington <address@hidden> writes:
> 
> > I suppose if I had to make a guess at what the architecture would
> > end up being, it would store the articles themselves as text XML
> > files (DocBook, perhaps), [..]
> 
> TEI has been suggested as well. 

I apparently have not been paying attention.  What is TEI?  I've never
heard of this before.  Is there a URL to more info you could provide?
 
> > When the XML files are added to the repository, converters would be
> > run to produce articles of other formats (text, word doc, pdf, ps,
> > tex, etc.)  When the user requests an article, he or she would also
> > specify the desired format.
> 
> To ease server load, this could even be part of a client or a proxy
> interface to the database (i.e. a proxy grabs the XML and converts it
> to whatever the client wants).

True.  My assumptions are that the number of desired formats will be
limited, that the converters can be easily identified ahead of time,
and that disk space on the server is plentiful.  I believe all of these
assumptions hold, and thus a proxy ought to be unnecessary.  If any of
them is untrue (please explain why, if so), then I would agree that a
proxy may make sense.
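
As a rough sketch of the convert-on-add idea, under the assumptions
above: every known converter runs once when an article enters the
repository, and the results are stored next to the XML.  The names
(CONVERTERS, convert_on_add, to_text) are hypothetical, and the single
plain-text converter merely stands in for real DocBook or TeX tooling.

```python
import xml.etree.ElementTree as ET

def to_text(xml_source: str) -> str:
    """Flatten an XML article to plain text by joining its text nodes."""
    root = ET.fromstring(xml_source)
    # Collapse runs of whitespace left over from the markup layout.
    return " ".join(" ".join(root.itertext()).split())

# Hypothetical registry: output format -> converter function.
# Real converters (pdf, ps, tex, ...) would be registered the same way.
CONVERTERS = {"txt": to_text}

def convert_on_add(article_id: str, xml_source: str) -> dict:
    """Run every known converter once, at the time the article is added.

    Returns a mapping of file name -> content; a real repository would
    write these to disk instead of returning them.
    """
    outputs = {f"{article_id}.xml": xml_source}
    for fmt, convert in CONVERTERS.items():
        outputs[f"{article_id}.{fmt}"] = convert(xml_source)
    return outputs
```

Serving a request is then just a file lookup by id and format, which
is why the proxy only pays off if the set of formats cannot be fixed
in advance.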

> > When producing mirrors of the repository, only the XML files,
> > commentary, makefiles, and conversion scripts and templates would
> > need to be transferred.
> 
> Really, only the XML files and an index. You could make URLs like:
> 
>   http://www.server.org/gnupedia/the-unique-id-of-the-article.xml
> 
> and all that changes for a mirror is the server name. 

Well, you need the commentary too, unless you wish to exclude it (which
I suspect you do not.)  Also, the makefiles, conversion scripts, and
templates make your life sooo much easier; I agree you can do without
them, but what would be the point?  ;-)
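
For the URL scheme quoted above, where only the server name changes
between mirrors, a minimal sketch might look like this.  The function
name and the mirror host are made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit

def on_mirror(url: str, mirror_host: str) -> str:
    """Return the same article URL with only the server name replaced."""
    parts = urlsplit(url)
    # Scheme, path, query and fragment are kept; only the host changes.
    return urlunsplit(parts._replace(netloc=mirror_host))

# Example with a hypothetical mirror host.
print(on_mirror(
    "http://www.server.org/gnupedia/the-unique-id-of-the-article.xml",
    "mirror.example.net",
))
```

The same rewrite applies unchanged to commentary files or converted
formats, provided they live under the same path on every mirror.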

> I think the suggestion of a separate classification (index, view,
> etc.) is the best. These could each be on different servers (or even
> mostly-independent projects) and just grab the appropriate XML article
> files from a mirror.

*Nod*
 
> The mirrors might keep a list of recently-added files and a master
> index of all the files they store to make life easy for things
> referencing them, but this would be an extremely simple thing to do
> and would allow concentration on parsers to get content into the
> appropriate format easily, and on the actual DTD/schema of the content
> itself. After these are solidified, more work might be done on making
> the repositories ``nicer'', but I really don't see the advantage of
> using databases at all; at some point, the clients are probably all
> going to want the raw XML, so getting it through a database seems like
> a waste of processing power on the server-side.

Don't underestimate the usefulness of a database.  Especially for
indexing. 
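
To illustrate the indexing point: the XML files can stay as plain
files while a small database holds only a word-to-article lookup.
This is a sketch using SQLite through Python's sqlite3 module; the
table layout and function names are assumptions, not a proposed
schema.

```python
import sqlite3

def build_index(articles: dict) -> sqlite3.Connection:
    """Build an in-memory word index from {article_id: plain_text}."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE word_index (word TEXT, article_id TEXT)")
    db.execute("CREATE INDEX idx_word ON word_index (word)")
    for article_id, text in articles.items():
        # One row per distinct lower-cased word in the article.
        words = {w.lower() for w in text.split()}
        db.executemany(
            "INSERT INTO word_index VALUES (?, ?)",
            [(w, article_id) for w in words],
        )
    return db

def lookup(db: sqlite3.Connection, word: str) -> list:
    """Return the ids of all articles containing the given word."""
    rows = db.execute(
        "SELECT article_id FROM word_index WHERE word = ? ORDER BY article_id",
        (word.lower(),),
    )
    return [row[0] for row in rows]
```

The articles themselves never pass through the database here, so
clients that want raw XML still fetch it as a static file.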

But I do agree as a general principle that content is king.  ;-)

Bryce



