dotgnu-general

Re: [DotGNU]GNU.RDF


From: Peter Minten
Subject: Re: [DotGNU]GNU.RDF
Date: Sun, 16 Mar 2003 15:38:13 +0100

Peter Minten wrote:
> 
> Hi folks,
> 
> I'd like to propose a new project: GNU.RDF .

Addition.

First, a small vocabulary change. Programs that have only read access to the
store used to be called agents; they are now called extractors. An agent is now
any extractor, manager or harvester (agents = extractors + managers +
harvesters).

I've defined a major task list:

--- MAJOR TASK LIST ---

1) Store design

The store is the heart of GNU.RDF: the whole RDF web of a server is stored in
it. The major challenge with the store is creating an efficient database design
to hold the RDF web. Then again, a few minutes of thinking produced this
design (in pseudo SQL):

TABLE triples
{
        subject : subjects.pkey%type;
        predicate : predicate.pkey%type;
        object : object.pkey%type;
}

TABLE subjects
{
        pkey : unsigned long; #2^64 = ~ 18 * 10^18
        uri : strings.pkey%type;
}

TABLE predicate
{
        pkey : unsigned long;
        uri : strings.pkey%type;
}

TABLE object
{
        pkey : unsigned long;
        type : strings.pkey%type;
        data : blob;
}

TABLE strings
{
        pkey : unsigned long;
        string : text;
}

Note that this design is somewhat optimised for size. One could replace the 5
tables by just the triples one. A lot of URIs would be duplicated, however, and
since nearly every URI takes up more space than a long, that would mean a much
bigger database (I'm guessing 2 or 3 times bigger).
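A minimal sketch of the five-table design, assuming SQLite through Python's
sqlite3 module (neither is named in this proposal). The helper names
open_store, intern_string, intern_uri and add_triple are hypothetical, as are
the example URIs used with them. It shows how each URI is stored once in
strings and then referenced by key:

```python
import sqlite3

# Assumption: SQLite integers are signed 64-bit, so they only approximate
# the "unsigned long" keys of the pseudo SQL.
SCHEMA = """
CREATE TABLE strings   (pkey INTEGER PRIMARY KEY, string TEXT UNIQUE NOT NULL);
CREATE TABLE subjects  (pkey INTEGER PRIMARY KEY,
                        uri  INTEGER UNIQUE NOT NULL REFERENCES strings(pkey));
CREATE TABLE predicate (pkey INTEGER PRIMARY KEY,
                        uri  INTEGER UNIQUE NOT NULL REFERENCES strings(pkey));
CREATE TABLE object    (pkey INTEGER PRIMARY KEY,
                        type INTEGER NOT NULL REFERENCES strings(pkey),
                        data BLOB);
CREATE TABLE triples   (subject   INTEGER NOT NULL REFERENCES subjects(pkey),
                        predicate INTEGER NOT NULL REFERENCES predicate(pkey),
                        object    INTEGER NOT NULL REFERENCES object(pkey));
"""

def open_store():
    """Create an empty in-memory store with the five tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn

def intern_string(conn, text):
    """Return the pkey of text in strings, inserting it only if it is new."""
    conn.execute("INSERT OR IGNORE INTO strings (string) VALUES (?)", (text,))
    row = conn.execute("SELECT pkey FROM strings WHERE string = ?", (text,))
    return row.fetchone()[0]

def intern_uri(conn, table, uri):
    """Return the pkey of uri in subjects or predicate, reusing existing rows."""
    if table not in ("subjects", "predicate"):
        raise ValueError(table)
    string_key = intern_string(conn, uri)
    conn.execute(f"INSERT OR IGNORE INTO {table} (uri) VALUES (?)", (string_key,))
    row = conn.execute(f"SELECT pkey FROM {table} WHERE uri = ?", (string_key,))
    return row.fetchone()[0]

def add_triple(conn, subject_uri, predicate_uri, type_name, data):
    """Store one triple; data is the raw bytes of the object value."""
    s = intern_uri(conn, "subjects", subject_uri)
    p = intern_uri(conn, "predicate", predicate_uri)
    o = conn.execute("INSERT INTO object (type, data) VALUES (?, ?)",
                     (intern_string(conn, type_name), data)).lastrowid
    conn.execute("INSERT INTO triples VALUES (?, ?, ?)", (s, p, o))
```

Adding two triples about the same subject leaves a single row in subjects and
a single copy of its URI in strings, which is where the size saving would come
from.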

Using the blob data type for the data field might not be very efficient. A
possible replacement is an unsigned long that refers to a row in one of several
tables, depending on the value of type. With this approach strings and numbers
would not be stored in blobs, which should offer a speed improvement.
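A rough sketch of that alternative, again assuming SQLite via Python's sqlite3.
The table names string_values and number_values and the helpers store_object
and load_object are made up for illustration. For brevity type is kept as
plain text here rather than as a key into strings:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE string_values (pkey INTEGER PRIMARY KEY, value TEXT NOT NULL);
CREATE TABLE number_values (pkey INTEGER PRIMARY KEY, value REAL NOT NULL);
-- data is no longer a blob: it is a key into the table selected by type.
CREATE TABLE object (pkey INTEGER PRIMARY KEY,
                     type TEXT NOT NULL,
                     data INTEGER NOT NULL);
""")

# Hypothetical mapping from the type field to the table holding the value.
VALUE_TABLES = {"string": "string_values", "number": "number_values"}

def store_object(conn, type_name, value):
    """Store value in the table for its type and return the object pkey."""
    table = VALUE_TABLES[type_name]
    ref = conn.execute(f"INSERT INTO {table} (value) VALUES (?)",
                       (value,)).lastrowid
    return conn.execute("INSERT INTO object (type, data) VALUES (?, ?)",
                        (type_name, ref)).lastrowid

def load_object(conn, pkey):
    """Follow the referral from object to the typed value table."""
    type_name, ref = conn.execute(
        "SELECT type, data FROM object WHERE pkey = ?", (pkey,)).fetchone()
    table = VALUE_TABLES[type_name]
    return conn.execute(f"SELECT value FROM {table} WHERE pkey = ?",
                        (ref,)).fetchone()[0]
```

The cost of this layout is one extra lookup per object; the gain is that
numbers and strings can be compared and indexed natively by the database.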

I don't know whether this approach of few tables is very good for speed. I
figure, however, that with proper indexing it will work.
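One way to check the indexing idea, assuming SQLite via Python's sqlite3: create
indexes on the triples table and ask the database how it would run a lookup.
The index names triples_spo and triples_ops and the helper query_plan are
hypothetical, and which indexes are really needed would depend on the queries
agents end up making:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE triples (subject INTEGER, predicate INTEGER, object INTEGER);
-- One index per assumed access path: lookups by subject and by object.
CREATE INDEX triples_spo ON triples (subject, predicate, object);
CREATE INDEX triples_ops ON triples (object, predicate, subject);
""")

def query_plan(conn, sql, params=()):
    """Return the human-readable steps SQLite plans to use for sql."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]  # the last column is the description

plan = query_plan(
    conn,
    "SELECT object FROM triples WHERE subject = ? AND predicate = ?",
    (1, 2),
)
for step in plan:
    print(step)
```

If the printed plan mentions triples_spo, the lookup is served from the index
instead of a full scan of the table.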

2) Agent API

Data will need to be put into and pulled out of the store in a secure way (only
by authorised programs). Agents will also need to communicate securely with
their clients. These are the goals for the agent API.
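As a small illustration of the read-only side (extractors), here is a sketch
that assumes the store is an SQLite file opened through Python's sqlite3. The
function names open_writer and open_extractor are hypothetical. It covers only
access to the store, not authentication or the communication between agents
and their clients:

```python
import sqlite3
from pathlib import Path

def open_writer(path):
    """Read-write connection, for programs that are allowed to modify the store."""
    return sqlite3.connect(path)

def open_extractor(path):
    """Read-only connection: any write raises sqlite3.OperationalError."""
    # SQLite accepts file: URIs with a mode=ro parameter when uri=True.
    uri = Path(path).resolve().as_uri() + "?mode=ro"
    return sqlite3.connect(uri, uri=True)
```

With this split an extractor cannot change the store even if its own code is
buggy, because the database itself refuses the write.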

Note that agents will not need to receive complete RDF in response to their
queries, but it should be possible. It must be possible to specify queries
using special RDF query functions or using plain old SQL (I see a lot of
subqueries on the horizon :-).

3) Basic agents

Some very basic agents will need to be provided.

4) RDF editor

On the client side the RDF web must be browsable and editable (if auth allows
it). To do this, a visual RDF web viewer and editor must be created. The editor
must have functions to hide/show certain resources based on the user's
preferences. At least non-resource properties must be editable using a simple
property table.
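The data side of those two editor functions could be sketched as follows. This
is only an illustration in Python: the Resource and Literal types and the
functions property_table and visible_triples are hypothetical, and triples are
assumed to be plain (subject URI, predicate URI, object) tuples:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Resource:
    """An object that is itself a node in the RDF web."""
    uri: str

@dataclass(frozen=True)
class Literal:
    """A non-resource object, e.g. a string or a number."""
    value: str

def property_table(triples, subject_uri):
    """Rows for the simple property table: non-resource properties only."""
    return [(predicate, obj.value)
            for subject, predicate, obj in triples
            if subject == subject_uri and isinstance(obj, Literal)]

def visible_triples(triples, hidden_uris):
    """Drop every triple that starts at or points to a hidden resource."""
    hidden = set(hidden_uris)
    return [(subject, predicate, obj)
            for subject, predicate, obj in triples
            if subject not in hidden
            and not (isinstance(obj, Resource) and obj.uri in hidden)]
```

The set of hidden URIs stands in for the user's preferences; a real editor
would more likely hide by class or by predicate.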

5) GNU RDF vocabulary

The GNU RDF vocabulary will be the standardized way to store (nearly) all
information about projects. The root of this namespace will be
'http://gnu.org/rdf'. Note that this does not mean gnu.org must have an rdf
directory: RDF resources use Uniform Resource Identifiers, not Uniform
Resource Locators.

A short list of important classes:

* Person
* Project
* Program
* Package
* Task
* Bug
* FeatureRequest
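
As a sketch of how these class names could map to identifiers under the
namespace root: the '/' separator and the helpers class_uri and in_vocabulary
are assumptions, since the proposal only fixes the root. Nothing here touches
the network, because the URIs are names, not locations:

```python
GNU_RDF_ROOT = "http://gnu.org/rdf"

CLASS_NAMES = ("Person", "Project", "Program", "Package",
               "Task", "Bug", "FeatureRequest")

def class_uri(name):
    """Build the identifier of a vocabulary class (assumed '/' separator)."""
    if name not in CLASS_NAMES:
        raise KeyError(name)
    return f"{GNU_RDF_ROOT}/{name}"

def in_vocabulary(uri):
    """True if uri is an identifier below the GNU RDF namespace root."""
    return uri.startswith(GNU_RDF_ROOT + "/")
```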

--- END MAJOR TASK LIST ---

Greetings,

Peter



