dotgnu-general

Re: [DotGNU]Implement RDF in a Universal Data Structure


From: Peter Minten
Subject: Re: [DotGNU]Implement RDF in a Universal Data Structure
Date: Fri, 28 Mar 2003 08:24:24 +0100

Seth Johnson wrote:
> 
> RDF benefits from the generalization that knowledge can be represented by
> assertions of a common, universal form.  All assertions have a subject, a
> predicate and an object.  This generalization enables the creation of tools
> that draw inferences from knowledge represented in this common form.
> 
> I want to take this to a higher level.  Instead of stopping at the assertion
> structure of RDF, I propose that we implement a similarly universal
> generalization about relations (among data entities).  This generalization
> reflects the purpose of representing what the most fundamental universal
> data structure is, which supports all the information access and retrieval
> functions necessary for any application (and any web service).
> 
> RDF represents relations as a series of statements.  It's really a form of
> many-to-many key table.  We can put RDF into this universal data structure,
> along with anything else we please, because putting things into the
> universal data structure makes interoperability implicit and automatic.
> 
> The universal data structure represents all relations as Use Types that are
> related to Link Types, each of which may be particularized into specific
> Uses and Links.  A particular Use of a certain Use Type represents the
> parent record of a relation, and the particular Links of a certain Link Type
> represent the children records related to that parent record:
> 
>    Use Type: Shopping Cart
>    Link Type: Products Selected
>    Use: Seth's Cart
>    Links: Soap, Shampoo, Milk, Butter, etc.
> 
> (This is only a piece of the structure, but it represents the core
> generalization that all the rest stems from)
>
> Once you have a universal data structure, you can define a fundamental
> protocol that says everything you need to know about any application, and
> you can store everything for all applications in one universal structure
> that inherently lets all elements in any particular such type of relation,
> be used freely in any other such type of relation.
> 
> I refer to this fundamental relation as a "context."  It can also be
> referred to as an atomic application.  A context is an extended version of
> the traditional idea of relations among data entities, turning that concept
> into the core of the idea of what an atomic universal application is
> necessarily made up of and must be able to do.  More complex applications
> are simply made by combining such atomic contexts.
> 
> RDF can be stored in this data structure as follows:
> 
>    Use Type: Subject
>    Link Type: (Various predicates, like "has" "contains," etc.)
>    Use: Whatever particular subject
>    Links: Whatever particular "objects" asserted to relate in the link type
> way to the particular subject.
> 
>    Use Type: Subject
>    LinkType: Has
>    Use: Seth Johnson
>    Links: arms, legs, a receding forehead
> 
> What you can do with this is generalize about the universal functions that
> must be built into such a representation of a universal, atomic
> application.  This includes the query functions that the RDF area focuses
> on.
> 
> Build this into DotGNU.  Make a language that speaks in terms of these
> abstractions.  I call the language CCL, or Context Control Language, and I
> call the basic structure of a context "packet" or "message" CTP, or Context
> Transfer Protocol.  CTP can either be defined as something immediately above
> TCP and immediately below the application layer, in a binary way, or we
> could define it as something correlative with HTTP, in a more textual way.
> 
> There's more to it, but maybe this ramble will interest some of you . . .

(a good idea deserves a long answer)

Hmm, it's an interesting idea. The interesting part is that you use links
instead of RDF properties, which are often translated to predicates. On the
technical level there is no real difference though: RDF properties can easily
express all kinds of links in the world.
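
To make the equivalence concrete, here is a minimal sketch of flattening one of
Seth's contexts into subject-predicate-object triples. The function name
context_to_triples and the use of "rdf:type" to record the Use Type are my own
assumptions for illustration, not part of Seth's proposal or of any existing
DotGNU code.

```python
# Hypothetical sketch: flatten a Use/Link "context" into RDF-style triples.
# The "rdf:type" predicate for the Use Type is an assumption made here.

def context_to_triples(use_type, link_type, use, links):
    """Return (subject, predicate, object) triples for one context."""
    triples = [(use, "rdf:type", use_type)]
    # Every Link becomes one statement with the Link Type as predicate.
    triples.extend((use, link_type, link) for link in links)
    return triples


cart = context_to_triples(
    "Shopping Cart",
    "Products Selected",
    "Seth's Cart",
    ["Soap", "Shampoo", "Milk", "Butter"],
)

for triple in cart:
    print(triple)
```

Each Link ends up as one statement about the Use, which is exactly the
many-to-many key table Seth describes.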

In the infosphere (the world of info, cyberspace is a subset of it) everything
can be expressed using subject-link-object terminology. So everything can be
expressed as RDF. Or more exactly everything can be expressed as a web. I think
this is a very cool idea, especially since it allows us to store all possible
information in a web.

RDF is not sufficient for all purposes, however, due to problems with its
representation: it's usually written down as XML or triples, which lack
flexibility. A more serious problem is the need to name everything that you want
to refer to with a unique URI. IMHO it should also be possible to refer to
something by indirection, I mean to refer to the value of a property of a
resource instead of directly to a resource. This will also serve to store more
meaningful info than in RDF. It's often more interesting to know how to get to
something (using a link path) than where it is.

By link path I mean a path that starts at a certain resource and leads to the
target resource. Link paths will often start at the 'this' resource. Link paths
allow flexibility: if you have something 10 resources away from you on the path
and something changes at step 4, then there is a good chance that the resource
you're hardlinked to is not the right one anymore, but that the resource the
path refers to is.

The problem with link paths is of course that they tend to split at collections
where multiple routes are possible, but I think that's a solvable problem.

The moral: we need to develop a good link path protocol. Here is a start:

A link path is a dot path like in OO languages: it starts with a resource, and
all following parts are properties. The resource name 'this' refers to the
subject of the property whose object is the link path. An example:
'this.contact:author.foaf:lastname = "Minten"'
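
A minimal sketch of how such a dot path could be resolved against a list of
triples. It only handles the path part (not the '= "Minten"' comparison) and
only prefixed names, not URIs in brackets. The resource names doc1, person1 and
person2 and the function resolve are made up for illustration.

```python
# Hypothetical sketch: follow a dot-separated link path over plain triples.

def resolve(triples, this, path):
    """Follow `path` starting at `this`; return the set of resources reached."""
    head, *props = path.split(".")
    # 'this' stands for the subject that carries the link path.
    current = {this} if head == "this" else {head}
    for prop in props:
        # A step may fan out at collections, hence a set of results.
        current = {o for (s, p, o) in triples if s in current and p == prop}
    return current


triples = [
    ("doc1", "contact:author", "person1"),
    ("person1", "foaf:lastname", "Minten"),
    ("person2", "foaf:lastname", "DuPont"),
]
path = "this.contact:author.foaf:lastname"
before = resolve(triples, "doc1", path)

# The author of doc1 changes: a hard link to person1 would now be stale,
# while the path still leads to the current author's last name.
triples[0] = ("doc1", "contact:author", "person2")
after = resolve(triples, "doc1", path)

print(before, after)
```

Returning a set rather than a single resource is one simple way to cope with
paths that split at collections.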

Link paths can only contain URIs in pointy brackets (< >).
 
Link paths can contain requirements for properties by enclosing them in braces;
link paths can contain assumptions about the value of a property by putting them
after a double @ inside the braces. For example:

'<http://dotgnu.org/people>{lastname="Minten"@@http://dotgnu.org/people/mdupont}.mailbox'

The example is slightly misleading. It recommends trying the condition (property
lastname of a resource in the collection is "Minten") on mdupont first, but
since that will fail ("DuPont" != "Minten") the condition will then be tried on
every resource in the collection. I could have added extra conditions by putting
'lastname="Minten"' between parentheses, putting & after it and adding
another condition between parentheses after that.
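
The try-the-hint-first behaviour can be sketched as follows. This is only an
illustration under assumptions: the regular expression covers just the single
condition form of the example above (no & combinations), collection membership
is modelled with a made-up "member" property, and the pminten resource and both
mailbox values are invented test data.

```python
import re

# Hypothetical grammar for exactly one step of the form
#   <collection>{prop="value"@@hint}.property
PATTERN = re.compile(
    r'<(?P<coll>[^>]+)>'
    r'\{(?P<prop>\w+)="(?P<value>[^"]*)"(?:@@(?P<hint>[^}]+))?\}'
    r'\.(?P<tail>\w+)$'
)


def objects(triples, subject, prop):
    """All objects of `prop` on `subject`."""
    return [o for (s, p, o) in triples if s == subject and p == prop]


def resolve_conditional(triples, path, member_prop="member"):
    m = PATTERN.match(path)
    if m is None:
        raise ValueError("unsupported link path: " + path)
    coll, prop, value, hint, tail = m.group(
        "coll", "prop", "value", "hint", "tail")
    members = objects(triples, coll, member_prop)

    def satisfies(resource):
        return value in objects(triples, resource, prop)

    if hint in members and satisfies(hint):
        matches = [hint]  # the assumption held, no scan needed
    else:
        matches = [r for r in members if satisfies(r)]  # scan the collection
    return [o for r in matches for o in objects(triples, r, tail)]


PEOPLE = "http://dotgnu.org/people"
store = [
    (PEOPLE, "member", PEOPLE + "/mdupont"),
    (PEOPLE, "member", PEOPLE + "/pminten"),        # invented resource
    (PEOPLE + "/mdupont", "lastname", "DuPont"),
    (PEOPLE + "/mdupont", "mailbox", "mailto:mdupont@example.org"),
    (PEOPLE + "/pminten", "lastname", "Minten"),
    (PEOPLE + "/pminten", "mailbox", "mailto:pminten@example.org"),
]
example = ('<http://dotgnu.org/people>'
           '{lastname="Minten"@@http://dotgnu.org/people/mdupont}.mailbox')
result = resolve_conditional(store, example)
print(result)
```

With this data the hint on mdupont fails, so the whole collection is scanned
and the mailbox of the resource whose lastname is "Minten" is returned.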

Link paths can become quite large and full of conditions, however they're the
only way I see to safely keep links between things far away from each other.

Link paths should not be treated as URIs by the RDF server, but as link paths
(the type designator 'p' in the GNU.RDF store design is hereby reserved for link
paths).

--

About the CTP. I'm thinking of a fast binary protocol here. To ensure the speed
of the semantic web, GNU.RDF.QL and its older brothers are out of the question
as the main protocol between semantic web servers. The protocol should however
support link paths as they will be one of the principal ways to travel around
the semantic web. The protocol should also be as small as reasonably possible,
but it should not need to be decompressed.
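
To give a feel for what "small but not compressed" could mean, here is a rough
sketch of a length-prefixed binary field encoding. Nothing here is a defined
CTP format: the layout (one type designator byte, a 16-bit length, UTF-8 data)
is purely an assumption, and of the designators only 'p' for link paths comes
from the text above; 'u' for URIs is invented for the example.

```python
import struct

# Assumed field layout: 1 designator byte + 2-byte big-endian length + data.
HEADER = struct.Struct("!cH")


def encode_field(kind, text):
    """Encode one field; `kind` is a single byte such as b"p"."""
    data = text.encode("utf-8")
    return HEADER.pack(kind, len(data)) + data


def decode_fields(buf):
    """Decode a buffer of concatenated fields into (kind, text) pairs."""
    fields, offset = [], 0
    while offset < len(buf):
        kind, length = HEADER.unpack_from(buf, offset)
        offset += HEADER.size
        fields.append((kind, buf[offset:offset + length].decode("utf-8")))
        offset += length
    return fields


message = (
    encode_field(b"u", "http://dotgnu.org/people/mdupont")   # 'u' is made up
    + encode_field(b"p", "this.contact:author.foaf:lastname")
)
print(len(message), decode_fields(message))
```

A receiver can skip over fields it does not care about by reading only the
headers, which is the kind of property a textual protocol does not give you
cheaply.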

If we do things right, it should be possible to store things in RDF files that
use the fast binary notation instead of the rather cumbersome XML one. Or,
better yet, a fast system RDF database in the kernel. Anyone for kernel module
hacking? :-)

Greetings,

Peter


