gzz-commits
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert...


From: Hermanni Hyytiälä
Subject: [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert...
Date: Mon, 26 May 2003 05:07:05 -0400

CVSROOT:        /cvsroot/gzz
Module name:    gzz
Changes by:     Hermanni Hyytiälä <address@hidden>      03/05/26 05:07:05

Modified files:
        Documentation/misc/hemppah-progradu: masterthesis.tex 

Log message:
        Steven's comments

CVSWeb URLs:
http://savannah.gnu.org/cgi-bin/viewcvs/gzz/gzz/Documentation/misc/hemppah-progradu/masterthesis.tex.diff?tr1=1.204&tr2=1.205&r1=text&r2=text

Patches:
Index: gzz/Documentation/misc/hemppah-progradu/masterthesis.tex
diff -u gzz/Documentation/misc/hemppah-progradu/masterthesis.tex:1.204 
gzz/Documentation/misc/hemppah-progradu/masterthesis.tex:1.205
--- gzz/Documentation/misc/hemppah-progradu/masterthesis.tex:1.204      Wed May 
21 03:01:55 2003
+++ gzz/Documentation/misc/hemppah-progradu/masterthesis.tex    Mon May 26 
05:07:05 2003
@@ -264,7 +264,7 @@
 and KaZaa \cite{kazaaurl} use the FastTrack-based flooding protocol 
\cite{fasttrackurl}. However, it is not clear
 whether the power-law method is scalable or not, 
 as the majority of the query requests are sent only to the high degree peers 
while making 
-these peers to bear the load of the entire system.
+these peers bear the load of the entire system.
 
 %\begin{figure}
 %\centering
@@ -287,7 +287,7 @@
 \label{fig:gnutella_powerlaw}
 \end{figure}    
 
-Above presented improvements are only partial solutions. More advanced 
techniques 
+The improvements presented above are only partial solutions. More advanced 
techniques 
 to improve data lookup of loosely structured systems are discussed in chapter 
3. Yet, however, 
 techniques presented in chapter 3 are not adopted in any loosely structured 
system.
 
@@ -318,24 +318,24 @@
 
 \subsection{Systems}
  
-With tightly structured systems, it is feasible to perform \emph{global} data 
lookups in the overlay efficiently. By global lookup, we mean
+With tightly structured systems, it is feasible to efficiently perform 
\emph{global} data lookups in the overlay. By global lookup, we mean
 that the system is able to find a service from the overlay, if it exists.
-While there are significant differences among proposed tightly structured 
systems, they all have in common
-that \emph{peer identifiers} are assigned to participating peers from
-a large \emph{identifier space} by the overlay. Globally unique identifiers 
-are also assigned to application-specific data items, \emph{keys}, 
+While there are significant differences among proposed tightly structured 
systems, they all have a common property,
+in that \emph{peer identifiers} are assigned to participating peers from
+a large \emph{identifier space} by the overlay. Globally unique identifiers, 
\emph{keys}, 
+are also assigned to application-specific data items 
 which are selected from the same identifier space. For instance, globally 
unique keys can be created
 using a cryptographic content hash function (e.g., SHA-1 \cite{fips-sha-1}) 
over the contents of a data item. 
-The form of identifier space differs between proposed systems. Geometrical 
circular form of identifier space (and variants) 
+The form of identifier space differs between proposed systems. A geometrical 
circular form of identifier space (and variants) 
 is most widely used. For instance, Chord \cite{stoica01chord}, Koorde 
\cite{kaashoek03koorde}, 
 Pastry \cite{rowston01pastry}, SWAN \cite{bonsma02swan}, Tapestry 
\cite{zhao01tapestry}
 and Viceroy \cite{malkhi02viceroy} use a circular form of identifier space of 
$n$-bit integers modulo $2^{n}$. The
 value of $n$ varies among systems. Again, CAN \cite{ratnasamy01can} uses a 
$d$-dimensional geometrical torus 
 model to implement the form of identifier space.
 
-To store data into a tightly structured overlay, each application-specific
+To store data in a tightly structured overlay, each application-specific
 unique key (e.g., SHA-1 \cite{fips-sha-1}) is \emph{mapped} uniformly (e.g., 
using consistent
-hashing \cite{258660}) to an existing peer in the overlay. Thus, tightly 
+hashing \cite{258660}) to an existing peer in the overlay. Thus, a tightly 
 structured overlay assigns a subset of all possible keys to every 
participating peer. 
 We say that a peer is \emph{responsible} for the keys which are assigned by 
the overlay.
 Figure \ref{fig:structured_hashing} illustrates this 
@@ -357,7 +357,7 @@
 distributed data structure which resembles skip lists \cite{78977}.
 In figure \ref{fig:structured_query}, we present an overview of Chord's data 
lookup process.
 On the right side of Chord's lookup process, the same data lookup process
-is shown as a binary-tree abstraction.  It can be noticed, that in each step, 
the distance 
+is shown as a binary-tree abstraction.  It can be seen, that in each step, the 
distance 
 decreases with a logarithmic efficiency.
 
 \begin{figure}
@@ -372,7 +372,7 @@
 \ref{fig:kademlia_lookup} shows the process of Kademlia's
 data lookup. Viceroy \cite{malkhi02viceroy} maintains a butterfly data 
structure (e.g., \cite{226658}), 
 which requires only a constant number of neighbor peers while providing 
$O(\log{n})$ data lookup
-efficiency, where $n$ is the number of peers in the system. Koorde 
\cite{kaashoek03koorde}, a recent modification of Chord, uses de Bruijn graphs 
+efficiency where $n$ is the number of peers in the system. Koorde 
\cite{kaashoek03koorde}, a recent modification of Chord, uses de Bruijn graphs 
 \cite{debruijn46graph} to maintain local routing tables. It requires 
 each peer to have only about two links to other peers to provide $O(\log{n})$ 
performance.
 
@@ -388,7 +388,7 @@
 \cite{zhao03api}. Each of these abstractions represent a storage layer in the 
overlay, but
 have semantical differences in the \emph{usage} of the overlay.
  
-First, Distributed Hash Table (DHT) (see e.g., \cite{dabek01widearea}, 
\cite{rowstron01storage}),  
+First, Distributed Hash Table (DHT) (see e.g., \cite{dabek01widearea}, 
\cite{rowstron01storage})  
 implements the same functionality as a regular hash table by storing the 
mapping between a key and a value:
 
 \begin{itemize}
@@ -398,7 +398,7 @@
 \end{itemize}
 
 DHT's \emph{interface} is generic; values can be any size and type (e.g., 
content hash over a file). In the
-DHT abstraction, the overlay itself stores the data items. Figure 
\ref{fig:Structured_lookup_using_DHT_model} shows the DHT abstraction 
+DHT abstraction the overlay itself stores the data items. Figure 
\ref{fig:Structured_lookup_using_DHT_model} shows the DHT abstraction 
 of the tightly structured overlay. 
  
 Second, Decentralized Object Location (DOLR) (see e.g., 
\cite{kubiatowicz00oceanstore}, \cite{iyer02squirrel}) is a distributed 
@@ -412,11 +412,11 @@
 \end{itemize}
  
 
-The key difference between the DHT and the DOLR abstraction is that in the 
DOLR abstraction the overlay maintains only \emph{pointers} to the data. 
-Also, the DOLR abstraction routes overlay's messages to a nearest available 
peer, hosting a specific data item. This form of locality
+The key difference between the DHT and the DOLR abstraction is that, in the 
DOLR abstraction, the overlay maintains only \emph{pointers} to the data. 
+Also, the DOLR abstraction routes overlay messages to the nearest available 
peer, hosting a specific data item. This form of locality
 is not supported by DHT. DOLR's interface is similar to the DHT's interface, 
i.e., values can be any size and type.
 
-Third, tightly structured overlay can be used for scalable group multicast or 
anycast operations (CAST) (see e.g., \cite{zhuang01bayeux}).
+Third, tightly structured overlays can be used for scalable group multicast or 
anycast operations (CAST) (see e.g., \cite{zhuang01bayeux}).
 The basic operations include:
 
 \begin{itemize}
@@ -426,7 +426,7 @@
 \item \texttt{anycast(message, groupIdentifier)}: anycast a message to a group 
with a given group identifier.
 \end{itemize}
 
-The DOLR and the CAST abstractions have in common that they both use network 
proximity techniques
+The DOLR and CAST abstractions both use network proximity techniques
 to optimize their operations in the overlay. Figure 
\ref{fig:Strucutred_lookup_using_DOLR_model}
 presents the DOLR abstraction. 
 
@@ -456,14 +456,14 @@
 Chord's \cite{stoica01chord} distance function does have the property of 
unidirection
 (for a given point $p_i$ in the identifier space and distance $d$ > 0, there
 is exactly one point $p_j$ in a way that the distance between $p_i$ and $p_j$
-is $d$), but doesn't have symmetry (the distance from $p_i$ to $p_j$ is same 
as the
+is $d$), but does not have symmetry (the distance from $p_i$ to $p_j$ is same 
as the
 distance from $p_j$ to $p_i$). Pastry's \cite{rowston01pastry} distance 
function supports 
-symmetry, but doesn't support unidirection. According to 
\cite{balakrishanarticle03lookupp2p}, because 
+symmetry, but does not support unidirection. According to 
\cite{balakrishanarticle03lookupp2p}, because 
 of XOR-metric, Kademlia's distance function is both unidirectional and 
symmetric. Moreover, Kademlia's \cite{maymounkov02kademlia} 
 XOR-based metric doesn't need stabilization (like in Chord 
\cite{stoica01chord}) and backup links 
 (like in Pastry \cite{rowston01pastry}). 
-However, in all above schemes each hop in the overlay shortens the distance 
between 
-current peer working with the data lookup and the key which was looked up in 
the identifier space.
+However, in all of the above schemes, each hop in the overlay shortens the 
distance between 
+current peer working with the data lookup and the key that was looked up in 
the identifier space.
 
 Skip Graphs \cite{AspnesS2003} and SWAN \cite{bonsma02swan} employ a 
identifier space  
 in which queries are routed to \emph{keys}. In these systems
@@ -484,7 +484,7 @@
 
 Balakrishnan et al. \cite{balakrishanarticle03lookupp2p} have listed four 
requirements
 for tightly structured overlays\footnote{Authors use the term 'DHT' in their 
text, but in this context
-it doesn't matter as they list \emph{general} properties of tightly structured 
overlays.} which have to be addressed in order 
+it doesn't matter as they list \emph{general} properties of tightly structured 
overlays.} that have to be addressed in order 
 to perform efficient data lookups in tightly structured overlays. 
 First, mapping of keys to peers must be done in a load-balanced
 way. Second, the overlay must be able to forward a data lookup for a 
@@ -507,9 +507,9 @@
 Even though the loosely structured and the tightly structured approach are 
both Peer-to-Peer schemes, they 
 have very little in common. Indeed, the only thing they share is the fact that 
no other peer is more
 important than any other in the Peer-to-Peer network. Fault tolerance 
\emph{may}
-be an area, in which approaches have similar properties (e.g., no single point 
of failure) \cite{milojicic02peertopeer}.
+be an area in which approaches have similar properties (e.g., no single point 
of failure) \cite{milojicic02peertopeer}.
 Fault tolerance properties of both approaches are currently only initial 
calculations, or 
-experimented in simulation environments. In real-life, however, measuring 
fault tolerance is much more 
+experimented in simulation environments. In real life, however, measuring 
fault tolerance is a much more 
 challenging task and requires more research to get reliable answers.
  
 The most important differences between approaches are the performance and 
scalability properties. 
@@ -523,8 +523,8 @@
 assume that participating peers are homogeneous, and the rate of join or leave 
operation is low \cite{gurmeet03symphony,
 libennowell01observations, rowston03controlloingreliability}.  
 
-To end user, the biggest difference between these systems is how data lookups 
are performed. Loosely
-structured systems provide more rich and user friendly way of searching data 
than tightly structured systems 
+To the end user, the biggest difference between these systems is how data 
lookups are performed. Loosely
+structured systems provide a more rich and user friendly way of searching data 
than tightly structured systems 
 as they have a support for keyword searches \cite{yang02efficientsearch, 
lv02searchreplication}. Tightly structured 
 systems support only exact key lookups since each data item is identified by 
globally unique keys \cite{balakrishanarticle03lookupp2p, 
 harren02complex, ansaryefficientbroadcast03}.
@@ -612,21 +612,21 @@
 
 Table \ref{table_Peer-to-Peer_algorithms} lists proposed Peer-to-Peer 
algorithms 
 and their key properties with regard to performance and scalability. The list 
-includes algorithms from both loosely and tightly structured approaches. The 
list doesn't 
-include \emph{all} proposed Peer-to-Peer algorithms; only the ones which 
already have 
+includes algorithms from both loosely and tightly structured approaches. The 
list does not 
+include \emph{all} proposed Peer-to-Peer algorithms but rather includes the 
ones which already have 
 been widely deployed, or the ones which may be promising in the future 
-Peer-to-Peer systems are included.
+Peer-to-Peer systems.
  
 We decided to follow the guidelines from \cite{kaashoek03koorde} in measuring
 the properties of different Peer-to-Peer systems. However, we dropped
 out fault tolerance and load balancing properties, since they are hard to 
measure
-in face of real life requirements. Additionally, however, we decided to include
+in real life requirements. Additionally, however, we decided to include
 the number of \emph{real} network connections for each peer in the overlay. 
Next, 
 we describe the listed properties of Peer-to-Peer algorithms:
 
 \begin{itemize}
 \item \textbf{Lookup}: the number of messages required when a data lookup is 
performed.
-\item \textbf{Space}: the number of other peers which peer knows about 
(neighbors).
+\item \textbf{Space}: the number of other peers a peer is aware of (neighbors).
 \item \textbf{Insert/delete}: the number of network messages required when a 
peer joins or leaves the system.
  \item \textbf{Number of network connections}: the number of concurrent 
network connections required to maintain correct neighbor information.
 \end{itemize}




reply via email to

[Prev in Thread] Current Thread [Next in Thread]