[Fenfire-dev] PEG: Pointers overview


From: Benja Fallenstein
Subject: [Fenfire-dev] PEG: Pointers overview
Date: Wed, 20 Aug 2003 18:05:28 +0200

=================
Pointers overview
=================

:Author:  Benja Fallenstein
:Created: 2003-08-20
:Changed: $Date$
:Status:  Incomplete
:Scope:   Major
:Type:    Architecture


By nature, Storm blocks cannot be modified, as they are identified
by their cryptographic hash. But in order for Storm to be useful as a
backend for document storage, there must be a way to create
updateable documents.

The plan has always been to have *pointers*: A URI that can point
to different Storm blocks over time. So every time you change a document,
your computer creates a new Storm block containing the new version
of the document, and makes the document's pointer point to that block.

There have been prototype implementations of pointers before, but 
none of them was good enough to make a persistency commitment on.
It's important that:

- Pointers have "owners" and only a pointer's owner can change that
  pointer's target. This is important for e.g. Web pages.
  (Of course, everybody can create their own pointer to point to
  their own version of a document.)
- Full history of all documents is kept (except that you can delete 
  old versions, e.g. to save space).
- If you have two machines, data is easily synchronized between them.
  (If you changed a document on both machines, Storm should indicate
  this to you, and allow you to use diff/merge tools.)
- When you publish documents on the Internet, they retain the names
  they had on your machine; if you made links between them, these
  links continue to work unchanged.
- When you download documents from the Internet and store them locally
  on your machine, they likewise retain their names, and their links
  continue to work.
- Same when you send or receive documents by mail, or when you follow
  a link from a public Web page to a private document, etc. Even when
  you exchange documents on floppy disks, verification works the same way.
- Group work is possible; a pointer can be owned by a group of people,
  and members can be added to and removed from a group. When different
  people have edited a document concurrently, Storm indicates that;
  it should be possible to use diff and merge tools to reconcile the 
  alternative versions.
- Old versions are accessible as long as *anybody* keeps a copy--
  it doesn't have to be the original author. Even the author cannot
  remove previous versions from the history of a document.

I think that now I finally have a system that meets these goals
well enough to be workable. This PEG gives an overview of this system;
following PEGs will specify the details.



Issues
======

.. None yet.



Model
=====

Pointers, under this proposal, are URIs that authoritative RDF triples
can be associated with. These triples have the pointer as their subject
and can have any URI as their property and value.

The triples are append-only: The pointer's owner can add, but not
delete triples.

Additionally, every triple is associated with a timestamp of the time
that the triple was created. Details in another PEG.

This model gives great flexibility in the way that pointers can be used.
The simplest case is that of documents, as described above. The triples
associated with a document pointer have the form, ::

    <pointerURI>   ptr:hasVersion   <refURI>

The ``refURI`` is a URI of the kind specified in `ref_uris--benja`__.
It identifies a version of the document.

__ ../ref_uris--benja/peg.gen.html

The *current* version is the one whose ``hasVersion`` triple
has the most recent timestamp. The rationale for this decision is
explained below (see `Ghost versions`_ section).
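
As a minimal sketch of this rule (not the actual Storm implementation),
the following Python models pointer triples as timestamped records and
picks the current version. The ``Triple`` class, the ``current_version``
function and the ``urn:example:`` URIs are hypothetical names introduced
for illustration; representing timestamps as integers is likewise an
assumption, since those details are left to another PEG.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

HAS_VERSION = "http://purl.oclc.org/NET/storm/vocab/pointer/hasVersion"

@dataclass(frozen=True)
class Triple:
    subject: str    # the pointer URI
    predicate: str  # any property URI
    obj: str        # any value URI, e.g. a ref URI
    timestamp: int  # creation time; an integer is an assumption

def current_version(pointer: str, triples: Iterable[Triple]) -> Optional[str]:
    """Return the ref URI of the hasVersion triple with the latest timestamp."""
    versions = [t for t in triples
                if t.subject == pointer and t.predicate == HAS_VERSION]
    if not versions:
        return None
    return max(versions, key=lambda t: t.timestamp).obj

# The order in which triples arrive does not matter, only their timestamps.
triples = [
    Triple("urn:example:ptr", HAS_VERSION, "urn:example:ref-1", 100),
    Triple("urn:example:ptr", HAS_VERSION, "urn:example:ref-3", 300),
    Triple("urn:example:ptr", HAS_VERSION, "urn:example:ref-2", 200),
]
print(current_version("urn:example:ptr", triples))
```

Deleting the block behind an older version leaves the result unchanged,
because the choice depends only on the newest ``hasVersion`` triple.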

The triple model allows different people in a group, or different
machines owned by the same person, to add triples to a pointer
independently. Thus, if you edit the same document on two machines,
each of the machines can independently add new versions. Additionally,
the triple model allows individual past versions to be deleted
without impacting other versions.

This is useful for other things than documents, too (which is the reason
for choosing the extensible triple model). 

For example, a pointer may represent my mailbox. Each individual mail
I have received could be put into a block and associated with my
mailbox through a hypothetical ``mbox:contains`` property. I could then
delete some of the mails without impacting the others. (If, instead,
I used a document pointer that points to a list of mails currently
in my mbox, then I couldn't delete one mail without creating a whole
new list of mails.)

As another example, imagine a group of people collaboratively posting
items on a blog. The blog could be a pointer which is associated
with blog items through a ``blog:item`` property. This allows each
blogger's system to create blog items independently. (If the group
used a document pointer that points to a list of blog items, then
two people creating blog items simultaneously would create alternative
versions of the document, which would have to be merged.)
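
Because triples are only ever added, combining what two machines know
about a pointer can be pictured as a plain set union, which is why
independently created blog items need no merge step. The sketch below
assumes that picture; ``merge_triples``, the ``blog:item`` shorthand and
the ``urn:example:`` URIs are hypothetical, and timestamps are left out
for brevity.

```python
from typing import FrozenSet, Tuple

Triple = Tuple[str, str, str]  # (subject, property, value)

def merge_triples(a: FrozenSet[Triple], b: FrozenSet[Triple]) -> FrozenSet[Triple]:
    """Combine two append-only triple sets; no triple is ever dropped."""
    return a | b

BLOG = "urn:example:blog"
alice = frozenset({(BLOG, "blog:item", "urn:example:item-a")})
bob = frozenset({(BLOG, "blog:item", "urn:example:item-b")})

merged = merge_triples(alice, bob)
# Union is order-independent, so both machines end up with the same set.
assert merged == merge_triples(bob, alice)
print(sorted(value for _, _, value in merged))
```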

Additionally, the pointer system uses pointer triples for a couple
of internal tasks, to be specified in future PEGs.



Overview of internals
=====================

The problem, then, is how to associate triples with pointers in a way
that is authenticated (the system verifies a given triple was really
signed by the corresponding pointer's owner) as well as permanent
(once a triple is created, the pointer's owner cannot take it back,
as long as someone keeps a copy).

Originally, the idea was to use public key/private key 
digital signatures: Each pointer owner would have a cryptographic key, 
and would sign triples about their pointers with this key.

However, digital signatures by themselves aren't really permanent,
because you cannot rely on the private key to be kept secret forever.
Someone may steal the key, or be able to compute the private key
from the public key through a cryptanalytic attack.

A technique for increasing the lifetime of digital signatures is
to timestamp them. However, this technology is heavily patented
and thus we cannot use it.

I have developed an alternative system. The basic idea is described
in my dart, `deputy_based_signatures--benja`__. Future PEGs will give
the details of the implementation.

__ ../../dartboard/deputy_based_signatures--benja/idea.gen.html



Ghost versions
==============

One important point about pointers is that, to support group work,
Storm should notice when two people have edited a document independently
and, when they synchronize, offer to merge the alternative versions.

Before, we achieved this by listing, with each version of a pointer,
the previous versions of the pointer that it obsoleted. A version
would be current if it weren't obsoleted by any other version.

So if you create three versions of a document in succession, then the first
version would be obsoleted by the second version, and the second version
would be obsoleted by the third version; the third version would
thus be current.

But what happens if the second version were deleted? The third version
would only say that the second version was obsoleted; nothing would say
that the first version was obsoleted, too. Thus, the first version
would suddenly seem current, again.

I call this a *ghost version*: An old version from the past that
came back to haunt you. ;-)
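
The effect can be reproduced in a few lines. This is only a sketch of
the *old* obsoletes-only rule described above, with hypothetical names
(``current_versions`` and the version labels ``v1`` to ``v3``); it is not
code from any earlier prototype.

```python
from typing import Dict, Set

def current_versions(obsoletes: Dict[str, Set[str]]) -> Set[str]:
    """Old rule: a version is current if no known version obsoletes it."""
    obsoleted = set().union(*obsoletes.values())
    return set(obsoletes) - obsoleted

# Each known version maps to the versions it declares obsolete.
history = {"v1": set(), "v2": {"v1"}, "v3": {"v2"}}
print(sorted(current_versions(history)))

# Delete v2: nothing that remains says that v1 is obsolete.
without_v2 = {v: obs for v, obs in history.items() if v != "v2"}
print(sorted(current_versions(without_v2)))
```

With the full history only ``v3`` is current; once ``v2`` is gone, ``v1``
is reported as current alongside ``v3``.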

A similar situation occurs if, on a whim, you create two alternative
versions of a document, and store one of them on a floppy disk
but not on your main hard disk. Years later, you rediscover the
floppy disk and copy its contents onto your new computer.
As the version on the floppy disk, long forgotten, was never obsoleted
by any other version (because it was removed from the hard disk before
a successor version was created), it will appear to be "current."

Again, a version risen from the dead.

Ghost versions are particularly annoying when a document is published
on the Internet. If someone discovers a ghost version and puts it online,
it will seem to your readers as if there were two alternative current
versions of your document-- although one of them is really just a relic
from the past.

In order to avoid this problem, this PEG specifies that only the version
with the most recent timestamp is considered "current." (The "current"
version is the one shown when you type a pointer URI into your browser 
or follow a link.)

The problem, of course, is that if two people concurrently edit
a document, only one of their versions will be considered "current."

To alleviate this problem, we *additionally* allow versions to obsolete
each other, very similar to before. Recall that each version is represented
by a ``ref`` URI, which implies that there is a block with authoritative
metadata about the version. This block can contain triples of the form, ::

    <currentVersion>   ptr:obsoletes   <pastVersion>

(The ``pastVersion`` is a ``ref`` URI, and the ``currentVersion``
is the blank node that represents "this" ``ref`` URI; see
`ref_uris--benja`__.)

__ ../ref_uris--benja/peg.gen.html

These triples aren't used by Storm to determine what is shown
when someone enters a pointer URI into their browser. However, they
*are* used for showing the *history* of a pointer; they may be used
when Storm synchronizes two repositories; and they may be used when
someone *with change control* over a particular pointer requests
the current version of that pointer, so that they know that alternative
versions of the pointer exist and need resolution.

Sometimes, it may happen that Storm rings a "false alarm" because of
a ghost version. However, this will only happen to the owner of the
pointer-- someone who has change control over it and can fix the problem.
For the pointers of others, which you serve on the net, ghost versions
will not impact you (unless you look at their history to see
whether there are alternative versions).
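
The two rules of this section can be sketched together as follows. This
is an illustration under stated assumptions, not the specified
algorithm: ``Version``, ``current`` and ``alternatives`` are hypothetical
names, timestamps are assumed to be integers, and ``obsoletes`` stands in
for the ``ptr:obsoletes`` triples found in each version's metadata block.

```python
from typing import Dict, NamedTuple, Set

class Version(NamedTuple):
    timestamp: int       # assumed integer timestamp
    obsoletes: Set[str]  # ref URIs named in ptr:obsoletes triples

def current(versions: Dict[str, Version]) -> str:
    """What every reader sees: the version with the latest timestamp.

    Expects at least one version.
    """
    return max(versions, key=lambda ref: versions[ref].timestamp)

def alternatives(versions: Dict[str, Version]) -> Set[str]:
    """What an owner is warned about: other versions nobody obsoleted."""
    obsoleted = set().union(*(v.obsoletes for v in versions.values()))
    return set(versions) - obsoleted - {current(versions)}

versions = {
    "ref-1": Version(100, set()),
    "ref-2a": Version(200, {"ref-1"}),  # edited on one machine
    "ref-2b": Version(210, {"ref-1"}),  # edited concurrently on another
}
print(current(versions))
print(alternatives(versions))
```

Readers are always shown the newest version (here ``ref-2b``), while only
someone with change control would be told that ``ref-2a`` still awaits a
merge. A ghost version would appear in ``alternatives`` in the same way,
which corresponds to the "false alarm" case.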



Vocabulary defined in this PEG
==============================

This PEG defines the following URIs:

http://purl.oclc.org/NET/storm/vocab/pointer/hasVersion
    A property. The subject is a document and the object
    is a version of this document.

    Usually used in authoritative triples about pointers,
    as described in this PEG.    

http://purl.oclc.org/NET/storm/vocab/pointer/obsoletes
    A property. Both the subject and the object are versions
    of a document. The property states that the subject
    supersedes the object.

\- Benja 




