gzz-dev
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Gzz] Re: Available overlays PEG


From: Hermanni Hyytiälä
Subject: [Gzz] Re: Available overlays PEG
Date: 01 Aug 2003 11:46:53 +0300

On Thu, 2003-07-31 at 19:07, Benja Fallenstein wrote:
> > ==========================================================================
> > PEG available_overlays--hemppah: Available overlays
> > ==========================================================================
> > 
> > :Authors:  Hermanni Hyytiälä
> > :Date-Created: 2003-06-18
> > :Last-Modified: $Date: 2003/07/31 12:56:11 $
> > :Revision: $Revision: 1.39 $
> > :Status:   Current
> > :Scope:    Major
> 
> (I agree that the scope of this PEG is major, but it doesn't actually 
> fit the definition of "Major:" it's not true that "most code needs a 
> touchup." ``pegboard_format--tjl`` should probably be changed to say 
> something like, "A fundamental change" or something.)


Ok, so I don't touch this for now until there is (probably) a new status
for "fundamental" stuff.


> 
> > :Type:     Architecture
> 
> I don't agree with this: While changing the used overlay would be a big 
> change, it's still an implementation change, not an architectural change.


Agree.

> 
> > :Stakeholders: tjl, benja
> > .. Affect-PEGs:
> 
> Please fix the syntax here (line missing, see the compiled version for 
> error msg). Actually, I think "current" PEGs shouldn't have any 
> commented out fields like this-- the ".. Affect-PEGs" field should 
> simply be removed.


Hm, I didn't have any error messages while compiling this PEG (with
naviloop-mode).

Anyway, I removed the ".. Affect-PEGs" field.


> 
> > The purpose of this PEG document is to give a short overview of existing 
> > structured
> > P2P overlays which have an open source implementation. 
> 
> Not true: At the end, you specify that we shall change to Tapestry. You 
> should say so at this point, already.
> If you intend this PEG to make a no decision, only a recommendation, 
> then you could e.g. say,
> 
>      This PEG gives an overview of existing P2P overlays which have
>      an open source implementation. Based on the overview, it
>      recommends the Tapestry_ overlay to be used in Storm.
> 

Ok.


> (link to review of Tapestry)
> 
> > The reason for creating this PEG document is that the current 
> > implementation 
> > of GISP [5]_ seems to have (too) many obvious security exploits. Therefore 
> > we need 
> > to examine other available structured overlays which have an open source 
> > implementation in hope for finding a more mature P2P platform. 
> 
> s/ in hope for finding/in the hope to find/


Thank you :).


> 
> > The list of implemented overlays is as of July 2003.
> > 
> > Issues
> > ======
> > 
> > None.
> > 
> > Terminology
> > ===========
> > 
> > This section briefly covers the terminology used in this document. 
> > 
> > Abstractions
> >     The following text for the abstraction definitions is taken from 
> >     `Towards a Common API for Structured P2P Overlays`_ by Frank Dabek et 
> > al.
> > 
> > .. _DHT: 
> > 
> >     DHT
> >         The DHT abstraction provides the same functionality as a 
> >         traditional hashtable, by storing the mapping between a
> >         key and a value. This interface implements a simple store
> >         and retrieve functionality, where the value is always stored
> >         at the live overlay node(s) to which the key is mapped by
> >         the KBR layer. Values can be objects of any type. For ex-ample, the 
> > DHT 
> >         implemented as part of the DHash inter-face in CFS stores 
> >         and retrieves single disk blocks by their content-hashed keys.
> 
> s/ex-ample/example/
> s/inter-face/interface/

Uh...it seems that copy&paste from pdf didn't work w.r.t. hyphenation.

> 
> > .. _DOLR:
> > 
> >     DOLR
> >         The DOLR abstraction provides a decentralized direc-tory
> 
> s/direc-tory/directory/

Same here...

> 
> >         service. Each object replica (or endpoint) has an
> >         objectID and may be placed anywhere within the system.
> >         Applications announce the presence of endpoints by pub-lishing
> 
> s/pub-lishing/publishing/

...

> 
> >         their locations. A client message addressed with
> >         a particular objectID will be delivered to a nearby end-point
> 
> s/end-point/endpoint/, I think.

...

> 
> >         with this name. Note that the underlying distributed
> >         directory can be implemented by annotating trees associ-ated
> >         with each objectID; other implementations are pos-sible.
> 
> assoc-iated, pos-sible.

...

> 
> >         One might ask why DOLR is not implemented on
> >         top of a DHT, with data pointers stored as values; this is
> >         not possible because a DOLR routes messages to the near-est
> >         available endpoint providing a locality property not
> >         supported by DHTs. An integral part of this process is the
> >         maintenance of the distributed directory during changes
> >         to the underlying nodes or links.
> 
> near-est

...

> 
> It's not clear to me from the above if and how a DHT can be implemented 
> on top of a DOLR. Since we need a DHT for indexing, this is important. 
> (For downloading blocks, we can use DOLR.)


DHT and DOLR can be seen as "equal" abstractions, i.e., either one
doesn't rely on other: DHT can be implemented without relying DOLR and
vice versa.

(I will put this note in the text)


> 
> > Redundancy
> >     What techniques are used for redundancy     
> > 
> > Fault tolerance against hostile nodes
> >     What techniques are used against hostile nodes    
> 
> These aren't useful definitions. Either provide a definition that's not 
> only the words reordered, or don't provide any definition at all :-)
> 
> As a reader, I don't like to read lots of definitions before the actual 
> content of the document starts, so define only what you really think has 
> to be defined :-)


Hm, perhaps I should remove these defintions then :).


> 
> What do you *mean* by "redundancy," though? Maybe you should use "Fault 
> tolerance against node failures?"


System's redundancy while there are number of node failures in the
system.

But I will remove this "definition" (see above).

> 
> > Implemented overlays
> > ====================
> > 
> > In this section we list the structured P2P overlays which have an open 
> > source
> > implementation. Please notice
> 
> s/notice/note/
> 
> > that features described for each implementation 
> > are deliberately kept as short as possible.
> 
> "that the description of each implementation's features is"
> 
> > For more depth information about the 
> > overlays we suggest reading the original publications.
> 
> s/depth/in-depth/


Thanks again ;).

> 
> > Chord [1]_
> > ----------
> 
> I'd prefer if, instead of using a footnote, you simply gave the homepage 
> of a project as one point below, i.e., "Abstraction, home page, 
> redundancy..."

There *was* a home page link right above the "Abstraction" but I decided
to move it. I will change it back.

> 
> > Abstraction
> >     DHT_/DOLR_
> > 
> > Redundancy
> >     Replication, backup links.
> 
> What is "backup links?"

They can be considered as "extra" neighbor links that are used for
better fault tolerance/redundancy.

> 
> > Fault tolerance against hostile nodes
> >     (undefined)
> 
> What do you mean by "undefined"???


Not known.

(I will change this)

> 
> > Activity of development
> 
> It would be good if you put language and license before this, because 
> they're most important in evaluating the applicability of an overlay.

Good point, I will change this...

> 
> > Developer
> >     MIT (research community)
> 
> What do you mean by "(research community)"? You use this again, below, 
> but you never explain what you mean by it.

The community that is developing a software. The two choices are:

1) "software engineering community", i.e., a regular free software
project.

2) "research community, i.e., a research group that consist one or more
researcher who developes a software


(Again, I will this note include this in the text)


> 
> > License
> >     BSD-like
> 
> `Advertisement clause`__ or not?
> 
> __ http://www.gnu.org/licenses/license-list.html#OriginalBSD


No 

http://pdos.lcs.mit.edu/cgi-bin/cvsweb.cgi/sfsnet/COPYRIGHT?rev=1.1&content-type=text/x-cvsweb-markup


> 
> > Other notes
> >     No support for network locality- do not take network latencies into 
> > account
> >     when building neighbor links.
> 
> s/-/--/
> s/do not/does not/


Uh...thanks...


> 
> How does it implement DOLR if it doesn't support locality?

Locality is not needed for the DOLR abstraction. The nearby object can
be near in terms of hops or network latency (or both); Tapestry and
Pastry just happen to support DOLR with the network latency feature.

> 
> > Tapestry [2]_
> > --------------
> > 
> > Abstraction
> >     DOLR_
> > 
> > Redundancy
> >     According to `Tapestry: A Resilient Global-scale Overlay for Service
> >     Deployment`_:
> > 
> >         Tapestry is highly resilient under dynamic conditions,
> >         providing a near optimal success rate for requests
> >         under high churn rates, and quicly recovering from
> >         massive membership events in under a minute.
> 
> This doesn't list *techniques* (as per your defn above), it says how 
> useful the developers believe their techniques to be. You should 
> probably change the defn (or leave it out).


Well, I removed those "definitions", so...;).

> 
> s/quicly/quickly/, or add "[sic]"
> >     
> >     According to the Tapestry [2]_ website:
> > 
> >         Tapestry offers fault-resilient mechanisms for both object 
> >         location and point to point message delivery.  For object location, 
> >         Tapestry generates a small number of deterministic and independent 
> >         GUIDs for each object.  An object server publishes an object's 
> >         availability on all of its GUIDs, and a client issues Tapestry 
> > locate 
> >         requests for all of the object GUIDs in parallel.  This greatly 
> >         increases availability under fault conditions, while improving 
> >         general performance and reducing performance variability.  For 
> > point 
> >         to point message delivery, Tapestry provides pre-calculated backup 
> >         routes at each overlay hop.  UDP soft-state beacons measure 
> > up-to-date 
> >         link conditions. Tapestry uses such information and simple 
> > fault-avoidance 
> >         protocols to route messages around failures, providing successful 
> >         delivery with high probability if a path between the endpoints 
> > exists.
> 
> I think the stuff about message delivery doesn't apply to us, so you 
> could edit that out.

I think it's good to that Tapestry supports that kind stuff too, e.g.
message delivery can be used with IMs and who knows if Fenfire some day
deploys that kind of functionality.

> 
> > Fault tolerance against hostile nodes
> >     * PKI is used while creating node identifiers
> >     * MACs are used to maintain integrity of overlay traffic
> >     * Monitoring system for maintaining neighbor links
> 
> Explain how this helps against hostile nodes, which is the most 
> important part of this PEG after all.


PKIs: to prevent Sybil attacks
MACs: to maintain integrity of overlay traffic
Monitoring system: reduce packet loss/improve message delivery in the
overlay

(will put this in the text)


> 
> > Activity of development
> ...
> >      According to the Tapestry [2]_ website, Tapestry Version 2.0 contains 
> > the 
> >      following new functionality::
> 
> How is this relevant for "Activity of development"?

It shows how actively new features are included between to releases
(version's 1.0 functionlity is listed above that).

> 
> > Language
> >     Java (Sun JDK 1.3 or a compatible Java Development and Runtime 
> > environment).
> >     The Java interface libraries for the BerkeleyDB database 
> 
> How about Kaffe?

I haven't tested Kaffe after the StreamTokenizer bug was fixed; the
current stable release doesn't work as the bug fix is only available
with the CVS version.

I will test the current CVS version next week.

> 
> Can Tapestry run without JNI?

It *should* since only NBIO (AFAIK), equivalent to java.nio, part of the
Sandstorm package uses JNI and Sandstorm can use java.nio as a
replacement for its own NBIO package. However, I haven't tested it.

I will test running Tapestry/Sandstorm with java.nio next week.


> 
> > Other notes
> >     Support for network locality when building neighbor links.
> > 
> >     Why Oceanstore_ uses Tapestry [2]_? 
> 
> Don't pose rethorical questions like this-- it doesn't look good. Say 
> "Reasons why Oceanstorage uses Tapestry:" instead.
> 
> >     See http://www.oceanstore.org/info/whytapestry.html
> 
> This page is entitled, "Why Do We Need Locality?", though, not "Why do 
> we use Tapestry."

Does it really matter if the link is entitled "why Oceanstore uses
Tapestry?" ? :)

> 
> > Kademlia [3]_
> > -------------
> > 
> > Abstraction
> >     DHT_
> > 
> > Redundancy
> >     No simulation or test results published (not even in the original 
> > publication)
> 
> Replace ")" by ")."

Ok.

> 
> >     In *theory*, however, the "free-choice" feature
> >     gives peers freedom to adapt different conditions. However, the
> >     author of SharkyPy says:
> 
> Link "SharkyPy" to the product, and if possible, link "says" to the 
> source of the quote.


The source of the quote is author's response to my e-mail -- so it's not
possible to link it ;). SharkPy is described a few lines below so I
prefer not to link it.


> 
> >         Kademlia has (in my taste, that's why I decided to drop it) a bad
> >     hole which makes it's use in remote querying for packets pretty useless:
> 
> Don't  indent first line of blockquote.


Ups, this is happenend non-intentionally.


> 
> >     When you have key->value mappings, which have an equal key, in even in
> >     semi-large networks it gets very unprobable that you get all mappings,
> >     the more hosts there are, the less probable it is, and the more mappings
> >     there are, the less probable it is too. I've tried to remedy this in a
> >     later implementation, at the cost of lookup speed, but have never
> >     managed to get all entries when the network had over 100 nodes. (the
> >     later implementation is based on my own server-framework and uses no
> >     threads, btw.) This made it unusable for me, as basically I basically
> >     had to change the lookup-algorithm to query all nodes (back to gnutella,
> >     then...), to get all answers. And that's what is important in the
> >     network I designed it for.
> 
> Wow. That's pretty much a killer for us.
> 
> Any response from the authors?

No since I haven't asked. This was just a private conversation between
the author of SharkyPy and me. I think this issue may be related to the
C++ version as it is "...based on a new abstraction" (a cite from the
author of Kademlia) ;).

> 
> > Activity of development
> >     Java development discontinued, C++ version
> >     is under development.
> 
> Not released, right?

Yes, not yet.

> 
> > GISP [5]_
> > ---------
> > 
> > Abstraction
> >     DHT_/DOLR_
> > 
> > Redundancy
> >     Chord [1]_-like (since GISP uses similar routing tables as Chord)
> 
> If that only boils down to saying, "Replication, backup links," then I 
> think replicating that here is shorter than saying it's Chord-like ;-)

The reason why I put Chord-like is that GISP uses very similar
abstraction of the routing table and routing table is the key property
of the overlay that defines the basic features of whole system.

> > Fault tolerance against hostile nodes
> >     Based on our own initial experiments: the fault tolerance
> >     is relatively weak - no specific techiques used.
> 
> Uh, which experiments?

Which Tuomas and I performed (by checking source code and writing a
draft PEG) (e.g. search for string "[2003-06-18T07:54:03Z]" from the
fenfire log (the discussion is in Finnish though).


> 
> > Other notes
> >     Uses 10x more cache as Chord [1]_ for routing table.
> 
> Why does it? Source?


Yes, I checked the source code of GISP and Chord.


> 
> > MLDonkey [8]_
> > -------------
> > 
> > Abstraction
> >     DHT_ (Kademlia [3]_ algorithm)
> > 
> >     (MLDonkey [8]_ is compatible with Overnet_, and Overnet claims that it 
> > does 
> >     Kademlia [3]_ and multisource downloading)
> 
> As you discuss the two Pastry impls together, you should probably do the 
> same with Overnet.


Open source and  non-open source impls? Ok.


> 
> > Redundancy
> >     Not known (we can imagine that redundancy is relatively high since 
> > MLDonkey
> >     is widely deplyed)
> > 
> > Fault tolerance against hostile nodes
> >     Not known (we can imagine that fault tolerance is relatively high since 
> > MLDonkey
> >     is widely deplyed)
> 
> I don't think that we can just assume that...
> 
> BTW, the author of Overnet was very unhappy about MLDonkey deploying the 
> protocol because they thought it slowed down the network or something, 
> and they called MLDonkey machines "rogue peers." So such "rogue peers" 
> can apparently cause problems, even if they are programmed by a benign 
> (but ignorant) programmer.


Indeed, he was :). 

And I don't blaim him since MLDonkey developers have reverse-engeneered
the ed2k/overnet protocol using the Pandora sniffer
(http://www-sor.inria.fr/projects/relais/pandora/) -- and this may be
the cause why the Overnet network slowed down first place.


> 
> 
> > Language
> >     Objective-Caml (a compiler, not a interpreter)
> 
> s/a compiler, not a interpreter/compiled, not interpreted/
> 
> Objectiv-Caml is a language, not a compiler.


Already fixed that. Thanks anyway ;).


> 
> Other notes: It would be good to know how hard it would be to use 
> MLDonkey's DHT routing mechanism separately (ask programmers?).

Ok...

> 
> > SharkyPy [9]_
> > -------------
> >         This implementation is heavily based on threads, which makes it
> >     pretty resource-intensive on the computer it is running on. I never
> 
> Don't indent first line, please.

Ups...again...(maybe copy&paste or something -- in my editor the text
looks fine).

> 
> > Changes
> > =======
> > 
> > While reviewing the different features of open source implementations of 
> > structured P2P overlays we can conclude that Tapestry seems to be the most
> > mature currently available. Specifically, other implementations
> > lack of features w.r.t. redundancy and fault tolerance that the Tapestry
> > implementation currently supports. Other reasons for choosing Tapestry in 
> > order of 
> > importance:
> > 
> > - The license
> > - The activity of development
> > - The implementation language
> > 
> > As a result, we recommend using Tapestry's open source implementation with 
> > Storm.
> > For detailed changes and information about using Tapestry with Storm, please
> > see the `storm_with_tapestry--hemppah` PEG document.
> 
> s/with Storm/for use in Storm/. Use double backticks for the PEG name.

Fixed...

> 
> The meaning of "For detailed changes please see" isn't clear to me. Does 
> acceptance of this PEG mean that storm_with_tapestry--hemppah is also 
> accepted? Maybe say, "For a detailed list of changes necessary to use 
> Tapestry in Storm, see..."
> 
> ``storm_with_tapestry--hemppah`` isn't an accepted PEG; maybe should 
> refer to it as "pending PEG" or something...

Yes, it's better that way. 

> 
> Link the PEG, don't just give its name.

Ok.

> 
> > .. _`Towards a Common API for Structured P2P Overlays`: 
> > http://www.cs.berkeley.edu/~ravenben/publications/pdf/apis.pdf
> > .. _`Tapestry: A Resilient Global-scale Overlay for Service Deployment`: 
> > http://www.cs.berkeley.edu/~ravenben/publications/pdf/tapestry_jsac.pdf
> 
> You don't need the back-ticks here.

Removed...

> 
> For reading as text, it would be good to have the link URIs closer to 
> the actual links in the document.

I added the homepages right below the the overlay's name, I hope that
helps.



-Hermanni





reply via email to

[Prev in Thread] Current Thread [Next in Thread]