[Gzz-commits] manuscripts/storm short-paper.rst
From: Benja Fallenstein
Subject: [Gzz-commits] manuscripts/storm short-paper.rst
Date: Mon, 26 May 2003 21:52:38 -0400
CVSROOT: /cvsroot/gzz
Module name: manuscripts
Changes by: Benja Fallenstein <address@hidden> 03/05/26 21:52:38
Modified files:
storm : short-paper.rst
Log message:
restart short paper after realizing what the main point needs to be
CVSWeb URLs:
http://savannah.gnu.org/cgi-bin/viewcvs/gzz/manuscripts/storm/short-paper.rst.diff?tr1=1.5&tr2=1.6&r1=text&r2=text
Patches:
Index: manuscripts/storm/short-paper.rst
diff -u manuscripts/storm/short-paper.rst:1.5 manuscripts/storm/short-paper.rst:1.6
--- manuscripts/storm/short-paper.rst:1.5 Sun May 25 07:40:49 2003
+++ manuscripts/storm/short-paper.rst Mon May 26 21:52:38 2003
@@ -1,57 +1,61 @@
-========================================================================
-Storm: Supporting data mobility through location-independent identifiers
-========================================================================
-
-.. Main point of this paper:
- Location-independent identifiers support data mobility;
- DHT allows location-independent identifiers
+====================================================
+Storm: Using P2P to make the desktop part of the Web
+====================================================
Abstract
========
-- data mobility
-- problems
-- location-independent identifiers such as hashes
-- resolvable through DHT
-- our implementation (Storm) is beginning to be deployed
-
-.. In this paper, we define data mobility as a collective term for the
- movement of documents between computers, different locations
- on one computer and movement of content between documents.
- We identify dangling links and alternative versions as major
- obstacles for the free movement of data. This paper presents the Storm
- (STORage Module) design as one possible solution to these problems.
- Storm uses location-independent globally unique
- identifiers, append-and-delete-only storage and peer-to-peer networking to
- resolve problems raised by data mobility. Moreover, we discuss some
- specific use scenarios related to ad hoc networks, unreliable network
- connections and mobile computing, in which the need for data mobility
- is obvious. Our current prototype implementation works on a single system;
- peer-to-peer networking is in an early prototype stage.
-
-.. raw:: latex
-
- \category{H.5.4}{Information Interfaces and Presentation}{Hy\-per\-text/Hy\-permedia}[architectures]
- \category{H.3.4}{Information Storage and Re\-trie\-val}{Systems and Software}[distributed systems, information networks]
-
- \terms{Design, Reliability, Performance}
-
- \keywords{versioned hypermedia, dangling links,
- peer-to-peer,
- location-independent identifiers}
+Linking personal documents like we link Web pages is inconvenient
+enough that users rarely ever do it. A major reason is that
+links break when documents are moved around or sent by mail.
+We argue that while non-breaking links would be a convenience
+on the Web, they are a necessity for making Web-like hyperlinks
+useful on the local desktop.
+
+We propose Storm, a storage system identifying documents by
+cryptographic hashes and signatures, independently of their
+location. Our system automatically finds link targets wherever
+they are, on the local system or on the network.
+On the network, our identifiers are resolved
+through a peer-to-peer distributed hashtable.
+Thus, links continue to work unchanged when documents are emailed
+or published on the network.
+
+Our system uses URIs to integrate with the Web. We have so far
+extended KDE and Netscape Communicator 4 to understand
+our experimental URN namespace. Most other systems can use Storm
+through an HTTP gateway.
Introduction
============
+.. documents can be linked like web pages,
+ which would make them part of the web:
+
+Documents written with OpenOffice or Microsoft Word
+can nowadays be hyperlinked just like web pages--
+but nobody does it. XXX
+
+.. links needed that don't break when documents are moved:
+
+.. using location-independent identifiers for
+ non-breaking links:
+
+.. non-breaking links seem not globally resolvable:
+
Several hypermedia systems assume that identifiers either have to
-include location information or cannot be resolved globally.
+say where a document can be found on the network, or they
+cannot be resolved globally.
URLs, location-dependent identifiers, break when documents are
moved. Link services often query only a select set of link
servers, not the whole network [hill94extending-andalso-carr95dls]_.
-Berners-Lee [name-myth]_ argues that unique random identifiers
-are not globally feasible for this reason.
+Berners-Lee [name-myth]_ argues that for this reason,
+using unique, random-looking numbers to identify documents
+is not possible on a global scale.
+
+.. but DHTs can do it:
However, recent developments in peer-to-peer systems have
rendered this assumption obsolete. Structured overlay networks
@@ -62,158 +66,22 @@
This, we believe, may be the most important result of peer-to-peer
research with regard to hypermedia.
-- location-dependent identifiers cause broken links
-
-- alternative versions on independent systems hard to synchronize
+.. Freenet's cryptographical identifiers:
-- creating a location-independent namespace, resolve through DHT
+.. structure of this paper:
-Storm block storage
-===================
-
-.. as blocks, independent of network location:
-
-In Storm, all data is stored
-as *blocks*, immutable byte sequences identified by a SHA-1
-cryptographic content hash [fips-sha1]_.
-Purely a function of a block's content, block ids
-are completely independent of network location.
-
-.. similar to files, but immutable:
-
-Blocks are similar to files, but they cannot be modified.
-Any change in the data would cause the identifier to change too.
-
-.. identifiers self-certifying:
-
-Storing data in immutable blocks
-has a number of advantages. Firstly, it makes identifiers
-self-certifying.
-
-After downloading a block, we are can check whether the data
-matches the cryptographic hash in the identifier.
-Therefore, we can safely download blocks from an untrusted peer.
-
-.. link targets cannot be changed on us:
-
-When we make a reference to a block, we can be sure
-that even the original author of the target will not be able
-to change it (unlike with e.g. digital signatures).
-For example, if a newspaper refers to a letter
-to the editor this way, the letter's sender won't be able to change
-the reference into an advertisement for a pornographic web page.
-
-.. caching trivial:
-
-Secondly, caching becomes trivial, since it is
-never necessary to check for new versions of blocks.
-
-.. flash crowds alleviated:
-
-If peers make the blocks in their caches available on the network,
-the flash crowd problem could be alleviated: The more users
-request a block, the more locations there are to download it from.
-This resembles e.g. the Squirrel
-web cache [iyer02squirrel]_; however, downloads can be
-from *any* peer since the source does not need to be trusted.
-On the other hand, there are privacy
-concerns with exposing one's cache to the outside world.
-
-.. replication easy:
-
-To replicate all data from computer A
-on computer B, it suffices to copy all blocks from A to B that B
-does not already store. This can be done through a simple 'copy'
-command. Different versions of a single document
-can coexist on the same system without naming conflicts, since
-each version will be stored in its own block with its own id.
-
-.. web links resolvable to local copies:
-
-The same namespace is used for local data and data
-retrieved from the network. When an online document has been
-permanently downloaded to the local harddisk, it can be found
-by a browser just as data from the network. This is convenient
-for offline browsing, for example in mobile environments:
-After a block has been downloaded, references to it will *never*
-cease to work, online or offline.
-
-.. append-only, bugs don't lose old data:
-
-Thirdly, immutable blocks increase *reliability*.
-When saving a document, an application will only *add* blocks,
-never overwrite existing data. When a bug causes an application
-to write malformed data, only the changes from one session
-will be lost; the previous version of the data will still
-be accessible. This makes Storm well suited as a basis
-for implementing experimental projects (such as ours, Gzz).
-Even production systems occasionally corrupt existing data
-when an overwriting save operation goes awry; for example,
-one of the authors has had this problem with
-Microsoft Word many times.
-
-.. (was footnote) Unfortunately, efficient versioned storage (Section 6)
- makes matters a little more complicated; still,
- the basic assertion holds.
-
-.. mirrors trivial:
-
-Links to a block will work as long as *any* peer
-holds a copy. Thus, providing mirrors is trivial.
-Even after failure of all dedicated mirrors,
-a document may still be available from peers that have
-downloaded it. An archive of published blocks, in the spirit
-of the Web archive [waybackmachine]_, would only be yet another backup:
-normal links to a block would work as long as the archive
-holds a copy.
-
-.. more durable:
-
-Finally, because blocks are easy to move from system
-to system, we hope that block storage will be more *durable* than files.
-When users own multiple systems, or buy new systems
-to replace old ones, files are often on one harddisk
-and not the other, or moved to a floppy disk but not back
-to the harddisk. How many files you created in the 80s
-do you still keep around on your harddisk today? With block storage,
-each time a user buys a new computer, they could
-transfer all blocks from their existing systems to the new one,
-and blocks from old floppies could be copied to the harddisk
-without thinking about issues like which directory
-to keep them in. By making it easy to collect
-blocks produced on a diverse number of systems, it would be easier
-to keep old data around.
-
-.. persistency commitment:
-
-Of course, to meet this goal it is necessary that the block
-system remains backwards compatible at all times. We have therefore
-decided to enter a *persistency commitment* when we finalize
-the Storm design before the next release of Gzz: Any future version
-of the Storm specification thereafter will be able
-to handle any block created according to this version of the spec.
-This means that no matter how much we'll regret our current choices
-in the future, we commit to providing backward compatibility for them.
-
-.. incompatibility with existing systems:
-
-The advantages we have outlined are bought by an utter incompatibility with
-the dominant paradigms of file names and URLs. We hope that
-it would be possible to port existing applications to use Storm
-without too much effort, but we have not investigated
-the issue closely. This is because Storm was developed
-for the experimental Gzz system, a platform explicitly developed
-to overcome the limitations of traditional file-based applications.
+Storm
+=====
-- versioning, pointers
+.. a general storage system, using Freenet-like identifiers:
+.. part of the web -- URN scheme (so far experimental,
+ targetting registration):
Web integration
===============
-- URN scheme (so far experimental,
- targetting registration)
- HTTP gateway
- Konqueror and Netscape 4 understand Storm URNs
- KDE programs can load from and save to Storm URNs
@@ -222,34 +90,7 @@
Conclusions
===========
-We have introduced the Storm design to address two important issues
-raised by data mobility, dangling links and keeping track
-of alternative versions. In Storm, all data is stored as immutable blocks
-identified a SHA-1 hash. Application-specific indices of these blocks can be kept.
-
-Storm is not limited to network publishing;
-it can be also used for private document repository. Our present implementation
-does not support peer-to-peer distribution yet, but the Gzz project has used it
-for local storage and server-based collaboration for one and a half years.
-Currently, we are working on a GISP-based peer-to-peer
-implementation.
-
-We have written an HTTP gateway and plan integration with KDE.
-
-Work is also needed on user interfaces for Storm.
-
-.. Besides these issues with the backend, we are facing user interface issues
- as well -- for example the conventions for listing, moving and deleting
- blocks. Also conventions for which zones e.g. new blocks should be stored in
- must be resolved. Often they will be private, but when making changes to
- documents that are shared with a project group, the changes should be
- visible to others.
-
-We see Storm as a case study in
-the potentials of a system that does not use
-location-dependent identifiers at all. We hope to raise awareness
-for the prospects of location-independent systems based
-on structured overlay networks such as DHTs.
+..
Acknowledgements
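
The "Storm block storage" text removed by this patch describes blocks as
immutable byte sequences identified by a SHA-1 content hash, so that a
block downloaded from an untrusted peer can be checked against its own
identifier. A minimal sketch of that check follows. It assumes the
identifier is simply the hex SHA-1 digest of the block's bytes; the names
block_id, verify_block and BlockStore are hypothetical and are not taken
from the Storm code or URN scheme.

```python
import hashlib


def block_id(data: bytes) -> str:
    """Return a hypothetical block id: the hex SHA-1 digest of the bytes."""
    return hashlib.sha1(data).hexdigest()


def verify_block(expected_id: str, data: bytes) -> bool:
    """Check that downloaded bytes match the hash carried in the identifier."""
    return block_id(data) == expected_id


class BlockStore:
    """Toy in-memory, append-only store keyed by content hash (assumption)."""

    def __init__(self) -> None:
        self._blocks: dict[str, bytes] = {}

    def add(self, data: bytes) -> str:
        # Saving never overwrites: the same content always maps to the same id,
        # and different content always gets a different id.
        bid = block_id(data)
        self._blocks.setdefault(bid, data)
        return bid

    def get(self, bid: str) -> bytes:
        data = self._blocks[bid]
        # Re-verify on read, as one would for data fetched from a peer.
        if not verify_block(bid, data):
            raise ValueError("block content does not match its identifier")
        return data


if __name__ == "__main__":
    store = BlockStore()
    v1 = store.add(b"first version of a document")
    v2 = store.add(b"second version of a document")
    # Both versions coexist under distinct ids; links to v1 keep resolving.
    print(v1 != v2, store.get(v1))
```

Because the identifier depends only on the content, the same lookup works
whether the bytes come from the local disk or from any peer on the network.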