[Bug-gnupedia] A Detailed Proposal - Mk I


From: Simon Cross
Subject: [Bug-gnupedia] A Detailed Proposal - Mk I
Date: Fri, 19 Jan 2001 04:31:35 +0200 (SAST)

Poing!

So far we (the list) have been discussing important but isolated issues.  
I'm going to present a broad proposal which covers most of the framework
that I feel will be necessary for the encyclopedia to work.  The proposal
is based on four key ideas:  peer review (as used by scientific journals),
digital signing of the article by both the author(s) and reviewer(s),
virtual identities (accounts), and location independent articles.  I ask
that you bear with me if parts of the proposal seem odd initially.  All
should become clear.


Preliminaries
-------------

Peer review:  The idea is that articles are not moderated (in the Slashdot
/ Kuro5hin sense) but rather that a reviewer reads an article and gives it
his/her approval by digitally signing it.  If the reviewer does not
approve of the article for some reason or feels the article needs
corrections then he/she does not sign the article.  Authors will be
responsible for having people review their articles.  When searching the
encyclopedia a user can then specify that only articles reviewed by a
particular set of reviewers (or teams of reviewers) should be searched.  
Or the user could ask that articles reviewed by a certain set of reviewers
be shown first.  Ideally the core of the encyclopedia should eventually
all be reviewed by well-known review teams.
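
A rough sketch of the two search modes described above (restricting a
search to trusted reviewers, or merely listing their articles first).
The Article class and both function names are made up for illustration,
and the reviewer signatures are assumed to have been verified already.

```python
from dataclasses import dataclass, field


@dataclass
class Article:
    name: str
    # Virtual identities of reviewers whose signatures verified (assumed
    # to have been checked already when the article was uploaded).
    reviewers: set = field(default_factory=set)


def only_reviewed_by(articles, trusted):
    """Keep articles signed by at least one trusted reviewer."""
    return [a for a in articles if a.reviewers & trusted]


def trusted_first(articles, trusted):
    """Keep every article but list trusted-reviewed ones first."""
    # sorted() is stable, so the original order is kept within each group.
    return sorted(articles, key=lambda a: not (a.reviewers & trusted))


articles = [
    Article("whales", {"virtual-simon"}),
    Article("dolphins"),
    Article("seals", {"someone-else"}),
]
trusted = {"virtual-simon"}
print([a.name for a in only_reviewed_by(articles, trusted)])
print([a.name for a in trusted_first(articles, trusted)])
```

Review teams could be handled the same way by expanding a team into the
set of its members before calling either function.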

Digital signing:  Peer review is worthless unless the reader can be sure
that the article really was written and reviewed by the people it names.
Once the authors have completed a draft of an article they sign the
article with their GPG Private Key and put their public key and author
information below the article text.  The reviewers then read the article
and do the same (if they feel the article is good enough).

Virtual identities:  Checking that an article has been signed by the GPG
Private Key associated with a given GPG Public Key is easy.  But how do we
associate a person with their GPG Public Key?  Checking real identities is
time consuming and expensive. VeriSign have made lots of money doing
it.  But we don't need to know who an author or reviewer is, we just need
to know that the same person who reviewed article A also reviewed articles
B, C and D.  Perhaps we also need to know that the person who reviewed
this article is an official member of this-or-that review team (for
example, the GNUPedia review team).

Location independent articles: If we're going to mirror GNUPedia (and we
are) then articles need to be transportable.  An article which has been
submitted to one GNUPedia server has to work on another GNUPedia server
without modification since modifying the article will break the digital
signatures.


An Outline of the Proposal
--------------------------

I propose that two types of servers be set up by the project.  The first
server type will store a list of authors and reviewers.  I will refer to
this as an author/reviewer server.  The second server type will store the
actual articles of the encyclopedia.  I will refer to this as an article
server.

In actual fact the author/reviewer server will only store a list of
virtual identities.  The function of the author/reviewer server will be to
allow the article server to confirm that a particular virtual identity has
signed a particular article.  The author/reviewer server will store the
following information for each virtual identity:
        - virtual name (account name)
        - a valid email address (for contact purposes)
        - GPG Public Key
        - date of creation of virtual identity (date of birth? *hehe*)
        - the review teams of which a given reviewer is a member 
Virtual identities can then be uniquely identified by a combination of the
author/reviewer server name (e.g. gnu-authors-reviewers.gnu.org), and
virtual identity name (e.g. virtual-simon).
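
The record kept for each virtual identity could be sketched like this.
The class and field names are assumptions chosen to mirror the list
above, and the email address and key are placeholders, not real data.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class VirtualIdentity:
    server: str          # author/reviewer server that issued the identity
    name: str            # virtual name (account name)
    email: str           # contact address
    public_key: str      # ASCII-armoured GPG public key
    created: date        # date the identity was created
    review_teams: tuple = ()  # review teams this identity belongs to

    def global_id(self):
        """Server name plus virtual name identifies the identity."""
        return (self.server, self.name)


simon = VirtualIdentity(
    server="gnu-authors-reviewers.gnu.org",
    name="virtual-simon",
    email="simon@example.org",            # placeholder address
    public_key="(public key goes here)",  # placeholder, not a real key
    created=date(2001, 1, 19),
    review_teams=("GNUPedia review team",),
)
print(simon.global_id())
```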

Let me stress at this point that there will be more than one
author/reviewer server.  Preferably lots.  Each author/reviewer server
should be free to decide who they give virtual identities to.  I would
suggest that the encyclopedia start a server which gives out identities to
anyone who wants one (they go to a website, fill out a form and get an
account / virtual id).  This author/reviewer server could also allow
people to group themselves into review teams.

Moving on to the article server. The task of the article server is to
deliver the articles in an HTML format to the reader's browser.  Any
article on any GNUPedia server can be uniquely identified by the following
information:
        - the article server to which the article was originally submitted
        - the server name of the author/reviewer server which holds the
          virtual identity which submitted the article (publishing author)
        - the virtual name of the publishing author
        - the article name as given by the publishing author
        - the article version number
Let's imagine that an article server has just sent an article to a client's
browser.  In this article is a link to another article which looks like
this:
        <a href="/cgi-bin/
        findarticle?
        articleserver= articleserver.gnu.org & 
        authorserver= gnu-authors-reviewers.gnu.org &
        authorname= joesoap &
        articlename= whales &
        articleversion= 2.1
        "> 
The script findarticle (we could provide a GNU version of this) then
decides how best to locate the new article.  If the article server has the
article (it might have articles from other article servers if it mirrors
them) then it can send the article to the browser.  If the article server
does not have the article it can redirect the browser to another article
server of its choice.  How it chooses the next article server is up to it.
A "default" might be to try to locate the article on the server it was first
published but other options are possible.  The articleversion variable
should be optional.  If it is omitted the most recent version of the
article is found.  If it is present then the closest version available is
sent.
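
A minimal sketch of what findarticle might do, assuming the query
parameters from the example link.  The function names, the in-memory
store and the meaning given to "closest version" (newest version not
newer than the one asked for) are all assumptions, not part of the
proposal.

```python
from urllib.parse import parse_qs


def parse_version(text):
    """Turn '2.1' into (2, 1) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def closest_version(available, wanted):
    """Pick the newest version not newer than `wanted`.

    Falls back to the oldest available version when every stored
    version is newer.  This reading of "closest" is an assumption.
    """
    older_or_equal = [v for v in available if v <= wanted]
    return max(older_or_equal) if older_or_equal else min(available)


def find_article(query_string, local_store):
    """Return ('serve', version) or ('redirect', server)."""
    params = {k: v[0].strip() for k, v in parse_qs(query_string).items()}
    key = (params["articleserver"], params["authorserver"],
           params["authorname"], params["articlename"])
    versions = local_store.get(key)
    if not versions:
        # Default policy: send the browser to the original server.
        return ("redirect", params["articleserver"])
    if "articleversion" not in params:
        return ("serve", max(versions))
    wanted = parse_version(params["articleversion"])
    return ("serve", closest_version(versions, wanted))


store = {("articleserver.gnu.org", "gnu-authors-reviewers.gnu.org",
          "joesoap", "whales"): [(1, 0), (2, 0), (2, 1)]}
base = ("articleserver=articleserver.gnu.org"
        "&authorserver=gnu-authors-reviewers.gnu.org"
        "&authorname=joesoap&articlename=")
print(find_article(base + "whales", store))
print(find_article(base + "whales&articleversion=2.0", store))
print(find_article(base + "dolphins", store))
```

A real script would answer with the article body or an HTTP redirect
rather than a tuple; the tuple just keeps the decision visible.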

The reason that the hyper-link cannot just link straight to the article
relates to the signing of the article by author and reviewers.  Imagine
that an article server A wants to mirror the content of article server
B.  Now A cannot just change all links to articles from B to link to its
copies of the articles since changing the text of an article (the
hyperlink text) will break the digital signatures on the article. 
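
To make the point concrete, here is a small illustration using a
SHA-256 digest as a stand-in for the data a GPG signature covers.  The
mirror URL is invented for the example.

```python
import hashlib


def digest(article_text):
    """SHA-256 of the article text, standing in for what gets signed."""
    return hashlib.sha256(article_text.encode("utf-8")).hexdigest()


original = '<a href="/cgi-bin/findarticle?articlename=dolphins">Dolphins</a>'
# A mirror that rewrote the link to point at its own copy ...
rewritten = original.replace("/cgi-bin/findarticle?articlename=dolphins",
                             "http://mirror.example.org/dolphins.html")

# ... would change the signed bytes, so the signatures no longer verify.
print(digest(original) == digest(rewritten))  # False
```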


Specific Issues
---------------
 
- Uploading articles to an article server:

When an article is submitted to an article server the server should check
the following:
        - That all the digital signatures are correct.
        - That all the virtual identities match their GPG Public Keys
        - That all links to local articles are correct.
In addition I suggest that authors not be able to overwrite old versions
of their articles.  Firstly, removing old articles has the potential to
cause problems.  Imagine a situation where an author's GPG Private Key is
compromised and someone replaces their 50 3-page articles with 50 1-line
articles filled with address@hidden 5p33k.  Secondly, the knowledge that
articles once submitted are not easily removed might lead to authors doing
better checking before submitting.  If old versions of articles need to be
cleaned out then this will have to be done by the article server
administrators.
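
The no-overwrite rule could be enforced with a store along these lines.
This is only a sketch: the class, method and exception names are
hypothetical, and the signature checks listed above are assumed to
happen before submit() is called.

```python
class VersionExistsError(Exception):
    """Raised when someone tries to replace a stored article version."""


class ArticleStore:
    def __init__(self):
        self._articles = {}  # (article id, version) -> article text

    def submit(self, article_id, version, text):
        """Store a new version; never replace an existing one."""
        key = (article_id, version)
        if key in self._articles:
            raise VersionExistsError(f"{article_id} {version} already stored")
        self._articles[key] = text

    def get(self, article_id, version):
        return self._articles[(article_id, version)]

    def admin_remove(self, article_id, version):
        """Clean-up path reserved for the server administrators."""
        del self._articles[(article_id, version)]


store = ArticleStore()
store.submit("whales", "1.0", "three pages about whales")
try:
    store.submit("whales", "1.0", "one line of junk")
except VersionExistsError as error:
    print("rejected:", error)
print(store.get("whales", "1.0"))
```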

- An example article:

I imagine an article would look like this:

        <header stuff>

        <begin the article>
        <title>Whales</title>
        
        blah ... blah ... blah
        
        <a href="/cgi-bin/findarticle? ...">
        A link to a GNUPedia article on dolphins</a>

        blah blah

        <a href="http://www.whalewatchers.com/">
        A link to somewhere else on the web </a>

        <references>
                <reference> reference source 1 </reference>
                ...
        </references>

        <publishing author info>
                <author name>  </author name>
                <author-reviewer server name>  </author-reviewer server name>
                <author signature>  </author signature>
                <author GPG Public Key>  </author GPG Public Key>
        </publishing author info>
        <coauthor info>
                ...
        </coauthor info>
        ... more authors
        
        <end the article>
        <publishing author signature> </publishing author signature>
        <coauthor signature> </coauthor signature>
        ... more author signatures

        <reviewer info>
                ... same info as for author
                but with signature included
                Might also include any
                review teams the reviewer belongs to
        </reviewer info>
        ... more reviewers

        <finishing stuff>

Only the information between the <begin the article> and the <end the
article> tags is signed so individual article servers are free to put
whatever they like in the header and footer sections.  The article will
most likely be stored as just the stuff between the <begin the article>
and the end of the digital signatures.  The header and footer will be
generated when the article is served (sent to the browser).  The authors'
identities are included in the signed section of the article since this
will prevent authors from being removed or added without breaking the
signatures.  The reviewers' information is kept outside the signed area to
allow reviewers to review articles after the authors have signed them.
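
Pulling out the signed region might look like the sketch below.  The
marker strings are copied from the example article; the function name
is made up, and leaving the markers themselves out of the signed text
is an assumption.

```python
BEGIN = "<begin the article>"
END = "<end the article>"


def signed_region(document):
    """Return the text between the begin and end markers.

    Whether the markers themselves are part of the signed text is not
    specified in the proposal; excluding them is an assumption.
    Raises ValueError if either marker is missing.
    """
    start = document.index(BEGIN) + len(BEGIN)
    end = document.index(END, start)
    return document[start:end]


# Two servers wrap the same article in different headers and footers.
page_a = "header A" + BEGIN + "Whales are big." + END + "footer A"
page_b = "header B" + BEGIN + "Whales are big." + END + "footer B"
print(signed_region(page_a) == signed_region(page_b))  # True
```

Because only this region is fed to the signature check, both pages
verify against the same author signatures.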

- Non-HTML content

Provision will have to be made for images, video clips, sound files and
other media to be included in encyclopedia articles.  These also need to
be signed by the authors and reviewers if possible.  Some standard method
for doing this needs to be devised.  Get cracking. :) If you like the
proposal so far. :)

- Updating GPG Keys

The validity of signatures will be checked when the article is submitted
to the article server so the aging of keys is not that much of an
issue.  If a GPG key is compromised then it will have to be updated.  It
might be worthwhile to specify that author/reviewer servers should keep old
keys so that an article server can request the GPG Public Key that the
author was using on a specific date.  No one should need to change their
GPG Key more than once a year so storing old keys shouldn't be a problem.
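
Looking up the key that was in use on a given date could be done with a
sorted list and a binary search.  A sketch only: the KeyHistory class
and the key strings are invented, and keys are assumed to be added in
chronological order.

```python
from bisect import bisect_right
from datetime import date


class KeyHistory:
    """Public keys of one virtual identity, ordered by start date."""

    def __init__(self):
        self._dates = []
        self._keys = []

    def add(self, valid_from, public_key):
        # Keys are assumed to be added in chronological order.
        if self._dates and valid_from <= self._dates[-1]:
            raise ValueError("keys must be added in date order")
        self._dates.append(valid_from)
        self._keys.append(public_key)

    def key_on(self, day):
        """Return the key in use on `day`, or None if none existed yet."""
        index = bisect_right(self._dates, day)
        return self._keys[index - 1] if index else None


history = KeyHistory()
history.add(date(2001, 1, 19), "key-2001")
history.add(date(2002, 3, 1), "key-2002")
print(history.key_on(date(2001, 6, 1)))   # key-2001
print(history.key_on(date(2002, 3, 1)))   # key-2002
print(history.key_on(date(2000, 1, 1)))   # None
```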

- Why Slashdot-like Moderation Won't Work

GNUPedia is not a news site.  It's an encyclopedia.  Articles
have to stay up pretty much permanently.  Many articles won't be read by
more than 10 people a year.  Having content disappear randomly (because
someone moderated it down too far) is unacceptable.  Most people don't
know enough to moderate more than 1% of encyclopedia entries.  I rest my
case.


Unrelated Points
----------------

- Copyright problems:  If we run into copyright problems we remove any
articles which we agree have infringed on someone's copyright.  If we feel
the person claiming copyright is in the wrong we keep the
article.  Problem solved.

- Submitting articles in other formats:  We support article submission in
as many formats as possible.  We store articles as XML/HTML.

- Article Structure:  As Robert Price pointed out, we should impose strict
structure on the HTML used in the articles.  No browser specific tags, no
dodgy HTML practices.

- No Refusal Policy: Thumbs up to the no refusal policy.  Unless the
formatting is wrong.  Or the signatures are broken.  Or the article
infringes on someone's copyright.  No refusal of articles except on
technical grounds.

- Hector's statement that peer review can wait until later:  I don't buy
it.  We need peer review from the start.  From now.  From yesterday. :)

- MathML: W3C endorsed the MathML 2.0 specification a week or two
back.  Do any browsers support MathML 2.0 ?  What is MathML 2.0
like?  Will it suit the needs of GNUPedia ?  Bear in mind that we probably
won't need too many formulae in the encyclopedia.


The END
-------

Well, that's my piece for the day.  I'll try and answer queries on this
lot if there are any but I'll only get around to it later.  I need a
break. :)  Happy scheming.  Long live GNUPedia.

Ciao
Simon Cross

--- Imagine there's no heaven.  It's easy if you try.  
    No hell below us.  Above us only sky.
                           John Lennon, Imagine.  ---

[ email:        address@hidden
  tel:  (c) 4979 486 380        (w) 4072 056 (120)
  Information reversed to foil spambots ]



