[Gzz-commits] manuscripts/storm short-paper.rst


From: Benja Fallenstein
Subject: [Gzz-commits] manuscripts/storm short-paper.rst
Date: Thu, 22 May 2003 16:47:48 -0400

CVSROOT:        /cvsroot/gzz
Module name:    manuscripts
Changes by:     Benja Fallenstein <address@hidden>      03/05/22 16:47:48

Modified files:
        storm          : short-paper.rst 

Log message:
        comments in source, structuring the lengthy discussion

CVSWeb URLs:
http://savannah.gnu.org/cgi-bin/viewcvs/gzz/manuscripts/storm/short-paper.rst.diff?tr1=1.3&tr2=1.4&r1=text&r2=text

Patches:
Index: manuscripts/storm/short-paper.rst
diff -u manuscripts/storm/short-paper.rst:1.3 manuscripts/storm/short-paper.rst:1.4
--- manuscripts/storm/short-paper.rst:1.3       Thu May 22 16:31:14 2003
+++ manuscripts/storm/short-paper.rst   Thu May 22 16:47:48 2003
@@ -72,23 +72,30 @@
 Storm block storage
 ===================
 
+.. as blocks, independent of network location:
+
 In Storm, all data is stored
 as *blocks*, immutable byte sequences identified by a SHA-1 
 cryptographic content hash [fips-sha1]_. 
 Purely a function of a block's content, block ids
 are completely independent of network location.
-Blocks have a similar or somewhat finer granularity
-as regular files, but they are immutable, since any change to the
-byte sequence would change the hash (and thus create a different block).
-Mutable data structures are built on top of the immutable blocks
-(see Section 6).
-
-Storing data in immutable blocks may seem strange at first, but
-has a number of advantages. First of all, it makes identifiers
-self-certifying: no matter where we have downloaded a block from,
-we are able to check we have the correct data by checking
-the cryptographic hash in the identifier. Therefore, we can 
-safely download blocks from an untrusted peer.
+
+.. similar to files, but immutable:
+
+Blocks are similar to files, but they cannot be modified.
+Any change in the data would cause the identifier to change too.
+
+.. identifiers self-certifying:
+
+Storing data in immutable blocks
+has a number of advantages. Firstly, it makes identifiers
+self-certifying. 
+
+After downloading a block, we can check whether the data
+matches the cryptographic hash in the identifier. 
+Therefore, we can safely download blocks from an untrusted peer.
+
+.. link targets cannot be changed on us:
 
 When we make a reference to a block, we can be sure
 that even the original author of the target will not be able 
@@ -97,10 +104,12 @@
 to the editor this way, the letter's sender won't be able to change 
 the reference into an advertisement for a pornographic web page.
 
+.. caching trivial:
+
 Secondly, caching becomes trivial, since it is
-never necessary to check for new versions of blocks. It is easy
-to replicate data between systems: A replica of a block never
-needs to be updated; cached copies can be kept as long as desired.
+never necessary to check for new versions of blocks.
+
+.. flash crowds alleviated:
 
 If peers make the blocks in their caches available on the network,
 the flash crowd problem could be alleviated: The more users
@@ -111,23 +120,16 @@
 On the other hand, there are privacy 
 concerns with exposing one's cache to the outside world.
 
+.. replication easy:
+
 To replicate all data from computer A
 on computer B, it suffices to copy all blocks from A to B that B
 does not already store. This can be done through a simple 'copy'
 command. Different versions of a single document
 can coexist on the same system without naming conflicts, since
 each version will be stored in its own block with its own id.
-In contrast, a system based on mutable resources
-has to use more advanced schemes, for example merging the changes
-done to a document at A or B. (Merging is still necessary
-when a user wants to incorporate a set of changes, but not
-required at replication time.)
-
-.. On the other hand for instance, several popular 
-   database management systems (e.g. Lotus Notes [ref]) have complex 
-   replication schemes, which may led awkward replication conflicts, 
-   because of they lack the immutable properties of data. 
-   [Or does this belong to diff section ? -Hermanni]
+
+.. web links resolvable to local copies:
 
 The same namespace is used for local data and data
 retrieved from the network. When an online document has been
@@ -137,23 +139,7 @@
 After a block has been downloaded, references to it will *never*
 cease to work, online or offline.
 
-.. tried to think this a bit, as 'never' is always a strong word.
-   can it be that data in the cache is 'out-of-sync', i.e. 
-   there are different versions so that links indeed break for 
-   off-line browsing? e.g. when there is document A linking 
-   to document B, both v1.0, and the user downloads them. then,
-   both are updated on the server, to versions 1.1. user dowloads
-   A, goes off-line, and tries to follow the link to B(1.1?). ? 
-   i know this is not what is meant above, but yet an example of how 
-   evil reviewers might react to the strong notion 'never' there ;o
-   or is it so that the link from A to B is not to a particular version,
-   but to any B, so that in that case the user would get B1.0 and be happy?
-   (or confused, if diff from A,B1.0->1.1 was so major that how A1.1 links to
-   B1.1 does not make any sense when the user ends up in 1.0 instead?)
-   -- antont
-
-.. Changed to read 'block.' That blocks are immutable should be clear
-   at this point ;-) -b
+.. append-only, bugs don't lose old data:
 
 Thirdly, immutable blocks increase *reliability*. 
 When saving a document, an application will only *add* blocks,
@@ -171,22 +157,18 @@
    makes matters a little more complicated; still,
    the basic assertion holds.
 
-Even when a publisher's server fails to serve a block,
-links to it will work until *no* other peer
+.. mirrors trivial:
+
+Links to a block will work as long as *any* peer
 holds a copy. Thus, providing mirrors is trivial.
-Even after failure of all of the publisher's mirrors,
+Even after failure of all dedicated mirrors,
 a document may still be available from peers that have
 downloaded it. An archive of published blocks, in the spirit
 of the Web archive [waybackmachine]_, would only be yet another backup:
 normal links to a block would work as long as the archive
-holds a copy. It would also be hard to purposefully remove
-a published document from the network; whether this is
-a good or a bad property we leave for the reader to judge.
-
-.. XXX When there are problems with network connectivity to a central
-   repository, peers can still work with each other.
-   According to Murphy's law, problems always occur on tight deadlines
-   (our project is empirical proof :-) )
+holds a copy.
+
+.. more durable:
 
 Finally, because blocks are easy to move from system
 to system, we hope that block storage will be more *durable* than files.
@@ -203,6 +185,8 @@
 blocks produced on a diverse number of systems, it would be easier
 to keep old data around.
 
+.. persistency commitment:
+
 Of course, to meet this goal it is necessary that the block
 system remains backwards compatible at all times. We have therefore
 decided to enter a *persistency commitment* when we finalize
@@ -211,6 +195,8 @@
 to handle any block created according to this version of the spec.
 This means that no matter how much we'll regret our current choices
 in the future, we commit to providing backward compatibility for them.
+
+.. incompatibility with existing systems:
 
 The advantages we have outlined are bought by an utter incompatibility with
 the dominant paradigms of file names and URLs. We hope that
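
Illustration (not part of the patch): the properties described in the
section above (content-derived ids, self-certifying downloads, and
replication by copying only the missing blocks) can be sketched in a few
lines of Python. This is a minimal sketch under stated assumptions, not
the Storm implementation: the names block_id, verify, BlockStore and
replicate are hypothetical, and the id is assumed to be the plain hex
SHA-1 digest of the block's bytes, since the excerpt does not give
Storm's actual id format.

```python
import hashlib


def block_id(data: bytes) -> str:
    """Assumed id format: hex SHA-1 digest of the block's bytes."""
    return hashlib.sha1(data).hexdigest()


def verify(claimed_id: str, data: bytes) -> bool:
    """Self-certification: the data must hash to the id it was requested by."""
    return block_id(data) == claimed_id


class BlockStore:
    """Hypothetical in-memory store of immutable blocks, keyed by id."""

    def __init__(self) -> None:
        self._blocks: dict[str, bytes] = {}

    def add(self, data: bytes) -> str:
        """Store a block; existing blocks are never overwritten."""
        bid = block_id(data)
        self._blocks.setdefault(bid, data)
        return bid

    def get(self, bid: str) -> bytes:
        return self._blocks[bid]

    def ids(self) -> set[str]:
        return set(self._blocks)

    def accept(self, claimed_id: str, data: bytes) -> bool:
        """Accept a block from an untrusted source only if it verifies."""
        if not verify(claimed_id, data):
            return False
        self._blocks.setdefault(claimed_id, data)
        return True


def replicate(src: BlockStore, dst: BlockStore) -> int:
    """Copy every block that dst lacks; return how many were copied."""
    missing = src.ids() - dst.ids()
    for bid in missing:
        dst.accept(bid, src.get(bid))
    return len(missing)


if __name__ == "__main__":
    a, b = BlockStore(), BlockStore()
    v1 = a.add(b"draft, version 1")
    v2 = a.add(b"draft, version 2")   # a new version is a new block
    print(v1 != v2)                   # both versions coexist under own ids
    print(replicate(a, b))            # number of blocks copied to b
    print(b.accept(v1, b"tampered"))  # mismatching data is rejected
```

Because a block never changes, replicate needs no merge step and running
it a second time copies nothing; accept is the only check required before
trusting data from an arbitrary peer.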



