[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert...
From: |
Hermanni Hyytiälä |
Subject: |
[Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert... |
Date: |
Wed, 12 Mar 2003 07:57:42 -0500 |
CVSROOT: /cvsroot/gzz
Module name: gzz
Changes by: Hermanni Hyytiälä <address@hidden> 03/03/12 07:57:42
Modified files:
Documentation/misc/hemppah-progradu: masterthesis.tex
progradu.bib
Log message:
Updates, typos, corrections etc.
CVSWeb URLs:
http://savannah.gnu.org/cgi-bin/viewcvs/gzz/gzz/Documentation/misc/hemppah-progradu/masterthesis.tex.diff?tr1=1.130&tr2=1.131&r1=text&r2=text
http://savannah.gnu.org/cgi-bin/viewcvs/gzz/gzz/Documentation/misc/hemppah-progradu/progradu.bib.diff?tr1=1.103&tr2=1.104&r1=text&r2=text
Patches:
Index: gzz/Documentation/misc/hemppah-progradu/masterthesis.tex
diff -u gzz/Documentation/misc/hemppah-progradu/masterthesis.tex:1.130
gzz/Documentation/misc/hemppah-progradu/masterthesis.tex:1.131
--- gzz/Documentation/misc/hemppah-progradu/masterthesis.tex:1.130 Wed Mar
12 05:58:45 2003
+++ gzz/Documentation/misc/hemppah-progradu/masterthesis.tex Wed Mar 12
07:57:42 2003
@@ -19,7 +19,7 @@
%***********************
\title{Fenfire in Peer-to-Peer Environment}
-\translatedtitle{Fenfire ja vertaisverkot}
+\translatedtitle{Fenfire vertaisverkko ympäristössä}
\author{Hermanni Hyytiälä}
@@ -35,14 +35,14 @@
Hermanni Hyytiälä\\
Huhtalammentie 5 as. 17\\
40640 Jyväskylä\\
-sähköposti: address@hidden
+E-mail: address@hidden
\abstract{
In this thesis, we review existing Peer-to-Peer approaches, algorithms and
their
key properties. We summarize open problems in Peer-to-Peer networks and divide
problems into three sub-categories. We observe that there are many
-problems, which have not solutions at all, or problems have proposed
+problems, which have no solutions at all, or problems that have proposed
solutions but they are practically unrealizable.
Then, we give an overview of Fenfire system. We evaluate existing
@@ -51,7 +51,7 @@
related data from Peer-to-Peer network.
}
\tiivistelma{
-Tässä opinnäytetyössä arvioimme olemassaolevia vertaisverkkoja, algoritmeja ja
+Tässä opinnäytetyössä esittelemme olemassaolevia vertaisverkkoja, algoritmeja
ja
niiden erityisominaisuuksia. Teemme yhteenvedon olemassa olevista ongelmista
vertaisverkoissa ja jaamme ongelmat kolmeen alakategoriaan. Havaitsemme, että
on olemassa useita ongelmia, joihin ei ole ratkaisua lainkaan, tai on ehdotelma
@@ -92,7 +92,9 @@
Dave Winer \cite{winer00whatisp2p} lists several
properties of Peer-to-Peer network, while most notably the statement
''The user's machine is a client and a server'' describes best Peer-to-Peer
-systems. To summarize, Peer-to-Peer systems can be characterized as
distributed
+systems. Schollmeier \cite{schollmeier01p2pdefinition} characterizes a
Peer-to-Peer network as a system of
+distributed entities that share their own resources (e.g. CPU time or storage
space).
+To summarize, Peer-to-Peer systems can be characterized as distributed
systems in which all communication is symmetric and all participants have
identical
capabilities and responsibilities. Each \emph{peer} may contribute data or
computing resources (e.g., unused storage) to the overall system and the
welfare
@@ -104,10 +106,10 @@
data lookup and security. In this thesis, we\footnote{Use of the plural is
customary even if research paper is authored solely.}
focus on these aspects in Peer-to-Peer domain.
Specifically, we review existing Peer-to-Peer approaches, algorithms and their
key properties. We observe
-that despite of create amount of proposed Peer-to-Peer systems, all systems
fall either
+that despite of great amount of proposed Peer-to-Peer systems, all systems
fall either to
loosely structured approach or tightly structured approach. We also discuss
open problems in
-Peer-to-Peer systems and divide problems into three sub-categories: security
related problems,
-performance related problems and miscellaneous problems. In the end, we
summarize all
+Peer-to-Peer systems and divide problems into three sub-categories: security
problems,
+performance problems and miscellaneous problems. In the end, we summarize all
problems in easy-to-understand tables.
Next, we give an overview of Fenfire hypermedia system, which implements
xanalogical storage model. We
@@ -130,7 +132,7 @@
is to find the most efficient way to locate and fetch Fenfire related data
from the
Peer-to-Peer network, where Storm scroll block's identifier is given. Second,
we want
to find the most efficient way to locate and fetch most recent Fenfire related
data from the
-the Peer-to-Peer network, which is associated with a given pointer random
string. Third problem
+Peer-to-Peer network, which is associated with a given pointer random string.
Third problem
is otherwise same as the second problem, except we want to locate and fetch
all Fenfire
related data from the Peer-to-Peer network, where given date and/or time range
is given.
@@ -139,7 +141,7 @@
existing Peer-to-Peer approaches, algorithms and key differences. In chapter
3, we
address open problems in Peer-to-Peer domain and divide problems into three
sub-categories. Chapter 4 gives an overview of Fenfire system. In chapter
-5, we evaluate existing Peer-to-Peer approaches with regard to Fenfire system,
propose system
+5, we evaluate existing Peer-to-Peer approaches with regard to Fenfire system,
propose a system
model for Fenfire in Peer-to-Peer environment and present simple algorithms to
perform data
lookups in Peer-to-Peer environment. In addition, we discuss possible problems
of using Fenfire
in Peer-to-Peer environment. In chapter 6, we present conclusions and future
work.
@@ -185,18 +187,18 @@
lookup and maintain security in a varying distributed environment. The most
popular
form of modern Peer-to-Peer computing is file-sharing. In this scenario,
participants
of Peer-to-Peer network share their resources to other participants while
obtaining
-more resources from others. This can been seen as a variant of distributed
file system
+more resources from others. This can be seen as a variant of distributed file
system
(e.g., \cite{levy90distributedfilesystems}).
In a development of modern Peer-to-Peer systems, lot of influences has been
attained from
-other research areas than computer science. There has been done research
regarding
-to ad-hoc nature of complex networks \cite{albert-02-statistical},
\cite{albert-00-tolerance}, \cite{watts00dynamics}.
+other research areas than computer science. Research has been conducted
regarding
+to self-organizing nature of complex networks \cite{albert-02-statistical},
\cite{albert-00-tolerance}, \cite{watts00dynamics}.
It's interesting to realize that chemical properties of cells, the Internet,
ad-hoc
-Peer-to-Peer systems, have all in common that they self-organize based on same
-principles. Furthermore, the association between social connections among
people
+Peer-to-Peer systems, and social networks have all in common that they
self-organize based on same
+principles. Furthermore, the association between social relationships among
people
and Peer-to-Peer overlay topology has been studied recently
\cite{watts00dynamics},
\cite{kleinberg99small}, \cite{nips02-Kleinberg}. This insight is motivated
-by Milgram, how noticed that people are very effective to locate other people
in a wide scale
+by Milgram, who noticed that people are very effective to locate other people
in a wide scale
based on local knowledge. This phenomenon is called as ''small-world
phenomenon''
\cite{milgram67smallworld}. As a consequence, many modern Peer-to-Peer systems
have applied techniques outside of computer science when constructing and
maintaining
@@ -204,24 +206,24 @@
In the end, however, there are two main approaches in which all modern
Peer-to-Peer
systems fall: loosely structured approach and tightly structured approach. In
loosely
-structured approach the construction and the maintenance of the overlay-- as
the name
-suggests- is controlled loosely. This approach gives freedom for participating
peers
+structured approach the construction and the maintenance of the overlay is
controlled
+loosely. This approach gives freedom for participating peers
to perform certain tasks in Peer-to-Peer network. On the other hand, tightly
structured
approach has some rules, which all participating peers have to obey. In the
following
-sections, we discuss in more detail both approaches, they disadvantages and
advantages, and
+sections, we discuss in more detail both approaches, their disadvantages and
advantages, and
key differences.
\section{Centralized}
Napster\footnote{We decided to include Napster in this section only because it
has
-historical value (see previous section).} \cite{napsterurl} was designed to
to allow
+historical value (see previous section).} \cite{napsterurl} was designed to
allow
people to share music. It was a hybrid Peer-to-Peer file-sharing system, i.e.,
the search
index was centralized and the distribution of storage and serving of files was
distributed.
Peers in the Napster network performed requests to the central directory
server to find
-other peers hosting desirable content. Since service requests was totally
based on
+other peers hosting desirable content. Since service requests were totally
based on
centralized index, Napster didn't scale well because of constantly updated
central
-directory, and had a possibility to single point of failure.
+directory, and had a single point of failure.
\section{Loosely structured}
@@ -231,8 +233,10 @@
The construction and maintenance of Gnutella network is extremely ad-hoc,
since participating
peers can form the overlay network based on \emph{local} knowledge. Figure
\ref{fig:gnutella_overlay}
illustrates how peers form an overlay network. Initially, peer 1 creates the
overlay, since
-it's the first participating peer. Then, repeatly new peers join the network
and connects to
-other nodes in a random manner. Thus, Gnutella can be considered as a
variation of \emph{random graph}.
+it's the first participating peer. Then, repeatedly new peers join the network
and connect to
+other nodes in a random manner. Thus, Gnutella can be considered as a
variation of \emph{scale-free
+graph}\footnote{In scale-free graphs (also known as power-law graphs) only a
few peers have high number of neighbor
+links and major of peers have low number of neighbor links.}.
\begin{figure}
\centering
@@ -245,12 +249,12 @@
In Gnutella, each participating peer maintains local index of its own shared
content. Also,
each peer has a few connections to other peer, i.e., peer's \emph{neighbors}.
Basic Gnutella
data lookup works as follows: peer broadcasts a query request to its
neighbors, which in turn
-forwards the query to their neighbors. This leads in the situation where
number of messages
-in the network can grow with $O(n^{2})$, where $n$ is the number of
participating peers in the
+forwards the query to their neighbors. This leads to a situation where number
of messages
+in the network can grow with $O(n^{2})$, where $n$ is the number of
participating peers in
Gnutella network. To limit the amount of network traffic, Gnutella uses
Time-To-Live-limited
-(TTL) flooding to distributed queries. Gnutella uses a Breadth-First-Search
(BFS) with depth limit
-$T$ (e.g., 7), where $T$ is the system-wide maximum TTL of a message in hops.
Therefore, only peers that
-are TTL hops away from the query originator will forward the query or respond
to the query.
+(TTL) flooding to distributed queries. Therefore, Gnutella uses a
Breadth-First-Search (BFS) algorithm
+with depth limit $T$ (e.g., 7), where $T$ is the system-wide maximum TTL of a
message in hops. Thus,
+only peers that are TTL hops away from the query originator will forward the
query or respond to the query.
In Gnutella network, search results are fast, because BFS sends queries to
every possible neighbor. Clearly, this method wastes resources and doesn't
scale well.
Figure \ref{fig:gnutella_query} shows the data lookup process of the Gnutella
network.
@@ -265,7 +269,7 @@
According to \cite{lv02searchreplication}, Gnutella's way to perform data
lookups, \emph{flooding}, has
following limitations. First, choosing the appropriate TTL in practice is not
easy. If the
TTL is too high, query originator may unnecessarily strain the network. If the
TTL is too
-low, the query originator might not find the desired data even it's available
somewhere
+low, the query originator might not find the desired data even if it's
available somewhere
in the network. Second, there are many duplicate messages generated by
flooding, especially
in high connectivity graphs. It is obvious that with these limitations,
flooding creates
significant message processing overhead for each data lookup. Even worse,
flooding may increase
@@ -273,21 +277,20 @@
Lately, there has been done lot of research to improve Gnutella's data lookup
efficiency
and scalability. Adamic et al. \cite{adamic99small},
\cite{adamic02localsearch},
-\cite{adamic01powerlawsearch} has been studied different random walk methods
in power-law
-networks\footnote{In power-law networks only a few peers have high number of
neighbor
-links and major of peers have low number of neighbor links.} and have found
that by
+\cite{adamic01powerlawsearch} has studied different random walk methods in
power-law
+networks and have found that by
instructing peers forwarding data lookups to select high degree peers, the
performance of data lookup
increases significantly. As a result, some of the most recent loosely
structured Peer-to-Peer systems have adopted this method with some
modifications
\cite{gnutella2url}, \cite{shareazaurl}, \cite{fasttrackurl},
\cite{morpheusurl},
-\cite{kazaaurl}, \cite{jxtaurl}, \cite{jxtaoverview},
\cite{botros01jxtasearch},
+\cite{kazaaurl}, \cite{waterhouse02searchp2p}, \cite{botros01jxtasearch},
\cite{ganesan02yappers}.
Figures \ref{fig:gnutella_overlay_supernodes} and
\ref{fig:gnutella_overlay_cluster}
illustrates two possible variations of power-law overlay networks. All the
systems
share the property of that high degree peers maintain index of all other peers
they know about. However, it's not clear whether this algorithm is scalable or
not,
-as majority of the query request are sent only to the high degree peers,
making
-them stress the overhead of nearly entire system.
+as majority of the query requests are sent only to the high degree peers,
making
+them stress the load of entire system.
\begin{figure}
\centering
@@ -318,16 +321,16 @@
all peers $p$ in system. Then, $\forall s \in S$, there is a provider of the
service,
expressed as $p = provider(s)$. Every $p$ has neighbor(s), named as
$neighbor$, which
is $P$ = \{$p \in P: \exists neighbor$, which is randomly chosen from $P$\}.
-\emph{Super peer} is a peer, which hosts the indices of other peers, $sp =
summaryindex(provider(s))$.
-Moreover, $\forall$ regular peer $p$, there is super peer, which has has a
index of regular
-peer's content, specifically $ps$, $P$ = \{$p \in P: \exists ps$,
-where $ps$ = $summaryindex(provider(s)) \wedge (p = provider(s))$\}
+\emph{Super peer} is a peer, which hosts the indices of other peers, $si =
summaryindex(provider(s))$
+and $\forall$ regular peer $p$, there is super peer, which has has a index of
regular
+peer's content, specifically $sp$, $P$ = \{$p \in P: \exists sp$,
+where $sp$ = $provider(summaryindex(provider(s))) \wedge (p = provider(s))$\}
\section{Tightly structured}
Partly due to scalability problems of loosely structured systems, several
tightly
-structured overlays has been proposed.
+structured overlays have been proposed.
This list includes CAN \cite{ratnasamy01can}, Chord \cite{stoica01chord},
Kademlia \cite{maymounkov02kademlia}, Kelips \cite{gupta03kelips},
Koorde \cite{kaashoek03koorde}, ODHDHT \cite{naor03simpledht},
@@ -351,7 +354,7 @@
To store data into tightly structured overlay, each application-specific
unique key (e.g., SHA-1 \cite{fips-sha-1}) is \emph{mapped} uniformly (e.g.,
using consistent
-hashing \cite{258660}) by the overlay to a existing peer in the overlay. Thus,
tightly
+hashing \cite{258660}) by the overlay to an existing peer in the overlay.
Thus, tightly
structured overlay assigns a subset of all possible keys to every
participating peer.
Also, each peer in tightly structured overlay maintains a \emph{routing
table}, which
consists of identifiers and IP addresses of other peers in the overlay.
Entries of routing
@@ -373,7 +376,7 @@
bit-wise exclusive or (XOR) (e.g., Kademlia \cite{maymounkov02kademlia}).
Because of XOR-metric, Kademlia's distance function is both unidirectional
(for a given point $p_i$ in the identifier space and distance $d$ > 0, there
-is exactly one point $p_j$ in way that the distance between $p_i$ and $p_j$
+is exactly one point $p_j$ in a way that the distance between $p_i$ and $p_j$
is $d$) and symmetric (the distance from $p_i$ to $p_j$ is same as the
distance from $p_j$ to $p_i$) \cite{maymounkov02kademlia}. On the other
hand, Chord's \cite{stoica01chord} distance function does have the property
@@ -386,7 +389,7 @@
hop in the overlay shortens the distance between current peer working with the
data lookup
and the key which was looked up in the identifier space.
-Skip Graphs and Swan employ a key space very similar to a tightly structured
+Skip Graphs \cite{AspnesS2003} and SWAN \cite{bonsma02swan} employ a key space
very similar to a tightly structured
overlay, but in which queries are routed to \emph{keys}. In these systems
peer occupies several positions in the identifier space, one for each
application-specific key. The indirection of placing close keys in the
@@ -395,7 +398,7 @@
''resource peer'' in the overlay network for each resource item pair it
publishes.
PeerNet differs from other tightly structured overlays in that it operates
-at the \emph{network} level layer. PeerNet makes an explicit distinction
+at the \emph{network} layer. PeerNet makes an explicit distinction
between peer identity and address, which is not supported by standard
TCP/IP-protocols. Otherwise, PeerNet has the same performance properties
as other tightly structured overlays, i.e., $O(\log{n})$ space required
@@ -416,19 +419,19 @@
differences in the data structure that they use as a routing table. For
example, Chord
\cite{stoica01chord}, Skip graphs \cite{AspnesS2003} and Skipnet
\cite{harvey03skipnet2} maintain a local
data structure which resembles Skip lists \cite{78977}.
-In figure \ref{fig:structured_query}, we present overview of Chord's lookup
process.
-On the left side of Chord's lookup process, we show the same data lookup
process
-as binary-tree abstraction. We can notice, that in each step, the distance
between
-logarithmic efficiency.
+In figure \ref{fig:structured_query}, we present an overview of Chord's lookup
process.
+On the right side of Chord's lookup process, the same data lookup process
+is shown as a binary-tree abstraction. It can be noticed, that in each step,
the distance
+decreases between with a logarithmic efficiency.
Kademlia \cite{maymounkov02kademlia}, Pastry \cite{rowston01pastry} and
Tapestry
\cite{zhao01tapestry} uses balanced $k$-trees as routing table's data
structure. Figure
-\ref{fig:kademlia_lookup} shows the process of Kademlia
+\ref{fig:kademlia_lookup} shows the process of Kademlia's
data lookup. Viceroy \cite{malkhi02viceroy} maintains a butterfly data
structure (e.g., \cite{226658}),
which requires only constant number of neighbor peers while providing
$O(\log{n})$ data lookup
efficiency. Koorde \cite{kaashoek03koorde}, recent modification of Chord, uses
de Bruijn graphs
-\cite{debruijn46graph} to maintain local routing tables. Koorde requires each
peer to have only
-about two links to other peers to to provide $O(\log{n})$ performance.
+\cite{debruijn46graph} to maintain local routing tables. Koorde
\cite{kaashoek03koorde} requires
+each peer to have only about two links to other peers to provide $O(\log{n})$
performance.
\begin{figure}
\centering
@@ -500,11 +503,11 @@
is defined as $i = identifier(s)$. Metric space is defined as a pair $(IS,d)$,
where $d$
is the distance between two coordinate points $ip_i$, $ip_j$ in $IS$ space.
Mapping
function is defined as $map: I \longmapsto IS$, and coordinate point as
-$ip = map(identifier(s))$, which maps data items, expressed by a identifier to
coordinate
+$ip = map(identifier(s))$, which maps data items, expressed by an identifier
to coordinate
point $ip$ in $(IS,d)$. Peer's p resources are mapped onto a set $IS$ = \{$ip
\in IS:
\exists s \in S$, $ip = map(identifier(s)) \wedge (provider(s) = p)$\}.
Every $p$ has neighbor(s), named as $neighbor$, which are $P$ = \{$p \in P:
\exists neighbor$,
-where $difference(p,p_neighbor)= ''close''$, where $''close''$ is small
difference $d$ in $(IS,d)$\}.
+where $difference(p,neighbor) = ''close''$, where $''close''$ is small
difference $d$ in $(IS,d)$\}.
\section{Summary}
@@ -515,19 +518,19 @@
\subsection{Differences}
-Even loosely structured and tightly structured approach are both Peer-to-Peer
schemes, they
+Even though loosely structured and tightly structured approach are both
Peer-to-Peer schemes, they
have very little in common. Indeed, the only thing they share is the fact that
no other peer is more
-important than other peer in the Peer-to-Peer network. Fault tolerance
\emph{may} may
-be an area, in which approaches have similar properties (e.g., single point of
failure).
+important than an other in the Peer-to-Peer network. Fault tolerance \emph{may}
+be an area, in which approaches have similar properties (e.g., no single point
of failure).
However, fault-tolerance properties of both approaches are currently only
initial calculations, or
experimented in simulation environments. In real-life, measuring fault
tolerance is much more
challenging task and requires more research to get reliable answers.
-Thus, there are significant differences between loosely structured and tightly
structured approaches.
-The most important aspect is the performance and scalability. While
performance of loosely structured approach
-is not always even linear, generally tightly structured approach can perform
all internal operations in
-poly-logarithmic time\footnote{However, it is unknown whether all proposed
algorithms can preserve
-logarithmic properties in real-life applications or not.}.
+There are significant differences between loosely structured and tightly
structured approaches.
+The most important difference between approaches is performance and
scalability properties. While
+performance of loosely structured approach is not always even linear,
generally tightly structured
+approach can perform all internal operations in poly-logarithmic
time\footnote{However, it is unknown
+whether all proposed algorithms can preserve logarithmic properties in
real-life applications or not.}.
Another key point is the philosophy how overlay network is constructed and
maintained. While loosely
structured approach gives much freedom to individual peers to join and leave
the overlay network, tightly
@@ -535,12 +538,12 @@
(such as mapping of data items).
To end user, biggest difference between these systems is how data lookups are
performed. Loosely
-structured systems provide much more richer and user friendly way of searching
data as they
-have support for keyword search and fuzzy search. On the other hand, tightly
structured systems support
+structured systems provide more richer and user friendly way of searching data
as they
+have support for keyword search. On the other hand, tightly structured systems
support
only exact key lookups as each data item is identified by globally unique keys.
-In the end, both systems have open problems and issues. We will discuss these
aspects more detail in
-chapter 3. Table \ref{table_comparison_approach} lists key differences between
loosely structured
+In the end, both systems have open problems and issues. We will discuss these
aspects in more detail in
+chapter 3. Table \ref{table_comparison_approach} lists the key differences
between loosely structured
approach and tightly structured approach.
@@ -653,7 +656,7 @@
listed above belongs to tightly structured approach since there has been
active
research being pursued towards tightly structured approach lately. List
doesn't
include \emph{all} proposed Peer-to-Peer algorithms. Only the ones which
already have
-been widely deployed in real life, or the ones which may promising in the
future's
+been widely deployed in real life, or the ones which may be promising in the
future
Peer-to-Peer systems are included in this thesis.
We decided to follow the guidelines from \cite{kaashoek03koorde} when
@@ -728,7 +731,7 @@
\parbox{37pt}{$O(1)$} &
\parbox{37pt}{$O(1)$} &
\parbox{37pt}{$O(n)$} &
-\parbox{85pt}{Typical configuration is 5 connections (2*5=10 total), however,
depends on implementation} &
+\parbox{85pt}{Typical configuration is 5 connections (2*5=10 total), however,
this depends on implementation} &
\parbox{85pt}{Number of messages can grow as fast as $O(n^{2})$}
\\ \hline
@@ -738,7 +741,7 @@
\parbox{37pt}{$O(\log{n})$} &
\parbox{37pt}{$O(\log{n})$} &
\parbox{85pt}{$2(\log{n})$} &
-\parbox{85pt}{There is no action required when nodes leaves the system}
+\parbox{85pt}{There is no action required when peers leave the system}
\\ \hline
@@ -747,7 +750,7 @@
\parbox{37pt}{$O$($\sqrt{n}$)} &
\parbox{37pt}{$O(1)$} &
\parbox{85pt}{$\frac{n}{\sqrt{n}} + c*(\sqrt{n}-1) + \frac{Totalnumber of
files}{\sqrt{n}}$, where n is the number of nodes and c the number of
contacts/foreign affinity group} &
-\parbox{85pt}{Insert/delete overhead is constant and performed background, the
performance of system may decrease if nodes are not homogeneous and nodes join
and leave the system in a dynamic manner}
+\parbox{85pt}{Insert/delete overhead is constant and performed in the
background, the performance of system may decrease if nodes are not homogeneous
and nodes join and leave the system in a dynamic manner}
\\ \hline
\parbox{37pt}{Koorde \cite{kaashoek03koorde}} &
@@ -755,7 +758,7 @@
\parbox{37pt}{$O(1)$ or $O(\log{n})$} &
\parbox{37pt}{$O(\log{n})$ or $O(\frac{\log{n}}{\log{}\log{n}})$} &
\parbox{85pt}{$2(\log{n})$} &
-\parbox{85pt}{Based on Chord algorithm, uses de Bruijn graphs for better
efficiency/fault-tolerance}
+\parbox{85pt}{Based on Chord algorithm, uses de Bruijn graphs for better
efficiency and fault tolerance}
\\ \hline
\parbox{37pt}{ODHDHT \cite{naor03simpledht}} &
@@ -763,7 +766,7 @@
\parbox{37pt}{$O(\log{n})$} &
\parbox{37pt}{$O(\log{n})$} &
\parbox{85pt}{$2(\log{n})$} &
-\parbox{85pt}{There are two lookup algorithms. The other is $O(\log{n})$,
which is robust under random deletion. The second is $O(\log^2{n})$, which is
also robust under spam generating model}
+\parbox{85pt}{There are two lookup algorithms. The other is $O(\log{n})$,
which is robust under random deletion. The second is $O(\log^2{n})$, which is
also robust under spam generating model.}
\\ \hline
@@ -797,7 +800,7 @@
\parbox{37pt}{$O(\log{n})$} &
\parbox{37pt}{$O(\log{n})$} &
\parbox{85pt}{$4r(\log{n}) + (\log{n})$, where r=number of resources
provided)} &
-\parbox{85pt}{In this approach, node is treated as 'named resource'; in this
approach, \emph{resources} self-organize (opposite to DHTs)}
+\parbox{85pt}{In this approach, node is treated as ''named resource''}
\\ \hline
\parbox{37pt}{SkipNet \cite{harvey03skipnet2}} &
@@ -813,7 +816,7 @@
\parbox{37pt}{$O(1)$} &
\parbox{37pt}{$O(n)$} &
\parbox{85pt}{Can be 1-10000 connections (aka social connections, connections
are permanent)} &
-\parbox{85pt}{Connection number depends on node's memory/network capabilities}
+\parbox{85pt}{Number of connections number depends on node's memory/network
capabilities}
\\ \hline
\parbox{37pt}{Symphony \cite{gurmeet03symphony}} &
@@ -821,7 +824,7 @@
\parbox{37pt}{$O(\log{n})$} &
\parbox{37pt}{$O(\log{n})$} &
\parbox{85pt}{$2k+2+f$, where k = long range connections, 2 = node's
neighbors, f = fault-tolerance connections)} &
-\parbox{85pt}{Space can be also $O(1)$. Additional space of $space^2$ can be
used as a lookahead list for better performance, not necessarily fault-tolerant
because of constant degree of neighbors}
+\parbox{85pt}{Space can be also $O(1)$. Additional space of can be used as a
lookahead list for better performance, not necessarily fault-tolerant because
of constant degree of neighbors}
\\ \hline
\parbox{37pt}{SWAN \cite{bonsma02swan}} &
@@ -829,7 +832,7 @@
\parbox{37pt}{$O(1)$} &
\parbox{37pt}{$O(\log^2{n})$} &
\parbox{85pt}{$r(2b+2s+2l)$ (where r=number of resources provided, b=boot
connections, s=short range connections, l=long range connections), typical
connection configuration: 2*(6+7+8)=36} &
-\parbox{85pt}{In this approach, node is treated as 'named resource'; in this
approach, \emph{resources} self-organize (opposite to DHTs)}
+\parbox{85pt}{In this approach, node is treated as ''named resource''}
\\ \hline
@@ -838,7 +841,7 @@
\parbox{37pt}{$O(\log{n})$} &
\parbox{37pt}{$O(\log{n})$} &
\parbox{85pt}{$(2^{b - 1})\frac{\log{n}}{b}$, where $b$ is a configurable
parameter for tuning digit-fixing properties (routing table)} &
-\parbox{85pt}{The performance of system performance may decrease if nodes are
not homogeneous and nodes join and leave the system in a dynamic manner, based
on Plaxton's algorithm}
+\parbox{85pt}{The system performance may decrease if nodes are not homogeneous
and nodes join and leave the system in a dynamic manner, based on Plaxton's
algorithm}
\\ \hline
\parbox{37pt}{Viceroy \cite{malkhi02viceroy}} &
@@ -846,7 +849,7 @@
\parbox{37pt}{$O(1)$} &
\parbox{37pt}{$O(\log{n})$} &
\parbox{85pt}{11} &
-\parbox{85pt}{The performance of system performance may decrease if nodes are
not homogeneous and nodes join and leave the system in a dynamic manner, not
necessarily fault-tolerant because of constant degree of neighbors}
+\parbox{85pt}{The system performance may decrease if nodes are not homogeneous
and nodes join and leave the system in a dynamic manner, not necessarily
fault-tolerant because of constant degree of neighbors}
\\ \hline
@@ -861,12 +864,12 @@
\chapter{Open Problems in Peer-to-Peer}
-In this chapter, we discuss open problems in Peer-to-Peer domain. We describe
+In this chapter, we discuss open problems in Peer-to-Peer research. We describe
open problems and their proposed solutions. Then, we list all issues in
tables; we list description of the problem, solution and comments on that
specific open problem. Note that open problems list considered here is not
meant
to be an exhaustive survey of \emph{all} open problems in Peer-to-Peer domain;
-we focus our attention to security, scalability, usability and performance
related issues
+we focus our attention to security, scalability, usability and performance
issues
only.
\section{Overview}
@@ -877,20 +880,20 @@
systems may no longer apply with Peer-to-Peer systems. Therefore, new
solutions are
needed to make Peer-to-Peer systems more secure and efficient.
-Both loosely structured and tightly structured approach have their own main
problems.
-Since Napster \cite{napsterurl} and Gnutella \cite{gnutellaurl} was first time
introduced
-to public, researchers' main concern has been scalability problem of loosely
structured
+Both loosely structured and tightly structured approach have their own
specific problems.
+Since Napster \cite{napsterurl} and Gnutella \cite{gnutellaurl} was first
introduced
+to public, researchers' main concern has been the scalability problem of
loosely structured
approach. However, people often misunderstand the scalability problem of
loosely structured
approach; \emph{network} of loosely structured systems is scalable, but the
\emph{data lookup model} is not.
The main concern of tightly structured system is to make overlay's data lookup
-routing more flexible against hostile attacks. Another key problems in tightly
structured
+routing more flexible against hostile attacks. Other key problems in tightly
structured
systems are the lack of keyword searches, support for heterogeneous peers and
load balancing
\cite{balakrishanarticle03lookupp2p}.
To make Peer-to-Peer systems even more popular (e.g., in industry),
Peer-to-Peer domain
needs better infrastructures to deal with security issues. There has been done
some
research regarding anonymity, access control, data availability and data
integrity but as
-we state in the following sections, much more research work is required to
solve security related issues.
+we state in the following sections, much more research work is required to
solve security issues.
\section{Security problems in Peer-to-Peer}
@@ -905,7 +908,7 @@
In Sybil attack model, hostile entity presents multiple
entities. Therefore, one hostile entity can control a large fraction of the
Peer-to-Peer system. Optimal
possible solution to Sybil attack would be that system could \emph{distinct}
entities of the system reliably. Unfortunately,
-currently there no realizable techniques for this task. Partial solutions for
Sybil attack is to replicate
+currently there are no realizable techniques for this task. Partial solutions
for Sybil attack is to replicate
and fragment data randomly among several participating peer. However, both
suggestions assume that two different
remote entities are actually different; Sybil attacks are still possible and
therefore, would need centralized
authority for reliable authentication. As author argues in
\cite{douceur02sybil}, without centralized authority,
@@ -914,22 +917,22 @@
In random fail-stop model, cited in \cite{naor03simpledht}, faulty peer is
deleted from the Peer-to-Peer system.
The reason for faultiness of peer can be a software failure, a hostile attack,
or external threat such as virus or
-Trojan. Closely related to fail-stop model is the Byzantine attack model
-\cite{357176}. Byzantine model can been seen more severe than fail-stop model
as there are no restrictions over
+trojan. Closely related to fail-stop model is the Byzantine attack model
+\cite{357176}. Byzantine model can be seen as more severe than fail-stop model
as there are no restrictions over
the behavior of faulty peers. Practical, but partial solution for Byzantine
failures has been proposed by Castro et
-al \cite{296824}.
+al. \cite{296824}.
Spam generating attack is another known attack model against Peer-to-Peer
system. In Spam
attack, hostile or faulty peer may produce false information of the data, or
refuses/is not able to reply to requests.
Possible solution against this attack is that peer should not trust to single
entity. Instead, peer should get
-information from multiple entities and trust on majority's opinion. This
methods requires more messages to be
-sent to network while increasing the load of system. However, if Spam attack
is combined with Sybil attack, obviously
+information from multiple entities and trust on majority's opinion. This
method requires more messages to be
+sent to network while increasing system load. However, if Spam attack is
combined with Sybil attack, obviously
previously mentioned solution doesn't work. Again, more research is required
to solve this attack model
safely. Naor et al. \cite{naor03simpledht} has proposed a partial solution
against Spam attack with
\emph{faulty} peers (not hostile).
-Traditional overload of targeted peers is best known form of distributed
Denial of Service attack (DDoS). For example,
-hostile entity can attempt to burden targeted peers with garbage network
packets. As a implication, peers may act
+Traditional overloading of targeted peers is best known form of distributed
Denial of Service attack (DDoS). For example,
+hostile entity can attempt to burden targeted peers with garbage network
packets. As an implication, peers may act
incorrectly or stop working. DDoS attack may be very severe, especially if
rate of replication and caching
in Peer-to-Peer system is low. This may lead to data loss in the Peer-to-Peer
system. Daswani et al.
\cite{daswani02queryflooddos} has done research regarding to this subject.
Authors suggest efficient load balancing
@@ -947,27 +950,28 @@
Trust in Peer-to-Peer systems is based on \emph{reputation}. Proposed
reputation methods focus either
on the semantic properties, or data management properties of the trust model.
Some research has been
done on reputation models in Peer-to-Peer systems, such as
\cite{aberer01trust}, \cite{cornelli02reputableservents}.
-Implementations include Advogato \cite{advogatourl}. None of the current
proposals or implementations
+One implementation include Advogato \cite{advogatourl}. None of the current
proposals or implementations
based on reputation address trust in a reliable way.
Optimal solution for trust in Peer-to-Peer systems would be certificate based
security models.
Quite recently, widely used Public Key Infrastructure (PKI) has been deployed
in distributed
-systems \cite{rivest96sdsi}, \cite{spkiworkinggroup}. PKI is an reliable
technology for securing
+systems \cite{rivest96sdsi}, \cite{spkiworkinggroup}. PKI is reliable
technology for securing
data in rather \emph{static} computing systems, such as in the Internet.
However, in Peer-to-Peer
network, the problem of key based security mechanism is the maintenance of the
keys as participating
-peer constantly join and leave the system. Specifically, the distribution of
key changes comes an essential
+peers constantly join and leave the system. Specifically, the distribution of
key changes comes an essential
problem in ad hoc environments. These include revocation of keys and new key
distribution in hostile
environment.
ConChord \cite{ajmani02conchord} is the first Peer-to-Peer system which has a
support for PKI based
-security infrastructure. Unfortunately, ConChord is in early in development
and lacks of important
-features of PKI to be fully usable yet. Furthermore, the hierarchy of
SDSI/SPKI \cite{rivest96sdsi},
-\cite{spkiworkinggroup} may be a problem for Peer-to-Peer systems, in which
hierarchy is intentionally missing.
+security infrastructure. Still, however, ConChord \cite{ajmani02conchord} is
in early phase of development and lacks of
+important features of PKI to be fully usable yet. Furthermore, the hierarchy
of Simple Distributed Security Infrastructure
+(SDSI) \cite{rivest96sdsi} and Simple Public Key Infrastructure (SPKI)
\cite{spkiworkinggroup} may be a problem for
+Peer-to-Peer systems, in which hierarchy is intentionally missing.
For data integrity, on the other hand, there are few working solutions.
Cryptographic content hashes
-\cite{fips-sha-1}, variations \cite{merkle87hashtree} and their implementation
techniques \cite{mohr02thex},
-are efficient and reliable method for identifying the integrity of data in
Peer-to-Peer systems. One
-possible application of cryptographic content hashes may in peer identifier
creation process, in which
+\cite{fips-sha-1}, their variations \cite{merkle87hashtree} and implementation
techniques \cite{mohr02thex},
+are efficient and reliable methods for identifying the integrity of data in
Peer-to-Peer systems. One
+possible application of cryptographic content hashes may be in peer identifier
creation process, in which
IP address of peer can be verified by the other peer. This is one form of
\emph{self-certifying data}.
@@ -976,12 +980,12 @@
According to \cite{dingledine00free}, there exists several kinds of anonymity.
Author-anonymity is a form
of anonymity in which no one can link author to a specific document. In
publisher-anonymity system,
no one is able to link publisher to a specific document. Reader-anonymity
means that a specific
-document cannot be linked to document's readers. This form of anonymity
protects the privacy of a
+document cannot be linked to document's readers. This form of anonymity
protects the privacy of
users of the system. Furthermore, peer-anonymity means that no peer can be
linked to a specific document, i.e.,
-no one is able to determine the peer, in where document was originally
published. Document-anonymity
+no one is able to determine the peer, where the document was originally
published. Document-anonymity
means that peer doesn't know which data it is currently hosting. Finally,
query-anonymity is a form
of document-anonymity; when other peers performs data lookups, peer doesn't
know which data it serves
-to the data lookup originators. As the authors cite, some of forms of
anonymity may imply each other and
+to the data lookup originators. As the authors cite, some forms of anonymity
may imply each other and
possible issues raised by this property is one area of future work.
With regard to anonymity in Peer-to-Peer systems, there has been done much
research work both at network
@@ -990,9 +994,9 @@
Obviously, providing several types of anonymity, it often conflicts with other
key properties of
Peer-to-Peer system. Let's consider anonymity and efficient data lookup. In
efficient data lookup, we must know
-the peers responsible to given data in Peer-to-Peer system. Of course, when we
know the peers responsible
+the peers responsible for given data in Peer-to-Peer system. Of course, when
we know the peers responsible
for the data, the anonymity of peer is lost. Fortunately, there are partial
solutions to previously
-mentioned situations, such as \emph{pseudonym} which is a partial form of
anonymity. For instance, pseudonym can used for
+mentioned situations, such as \emph{pseudonym} which is a partial form of
anonymity. For instance, pseudonym can be used for
addressing peer-anonymity by providing anonymous-like identifiers to peers
(e.g., peer identifiers of tightly
structured system).
@@ -1001,12 +1005,12 @@
Freenet \cite{clarke00freenet}, Publius \cite{pub00}, Free haven
\cite{dingledine00free}, Crowds \cite{reiter98crowds},
Tangler \cite{502002} and upcoming Mnet \cite{mneturl}. Forwarding proxies are
used in Freenet, Crowds and
Free Haven in order to provide various types of anonymity. Tangler and Publius
uses cryptographic
-sharing methods to split a data into data fragments \cite{Shamir1979a}. Mix
mailer networks, such as
+sharing methods to split data into fragments \cite{Shamir1979a}. Mix mailer
networks, such as
\cite{mixminionurl}, are commonly used in distributed systems, which are able
to provide some level
-of anonymity
+of anonymity.
Even if many existing Peer-to-Peer systems are able to provide some of the
types of anonymity, there is no
-such a system which is able to provide all kinds of anonymity as listed above.
Specifically, the conflicts
+such system which is able to provide all kinds of anonymity as listed above.
Specifically, the conflicts
between anonymity and other Peer-to-Peer system properties requires more
research work.
@@ -1014,13 +1018,14 @@
Any distributed computing system must support different levels of access
control. For instance, in Peer-to-Peer
system, we may want to restrict the accessibility of data to only limited
amount of participating peers. Yet, Peer-to-Peer
-systems doesn't have working and distributed access control scheme. Moreover,
+systems do not have working and distributed access control scheme. Moreover,
there has been a lot of violation of copyright laws by users of Peer-to-Peer
file sharing systems. As a
-consequence, some lawsuits has been created against the companies how have
build popular file-sharing programs.
+consequence, some law suits have been created against the companies who have
build popular file-sharing programs.
To our knowledge, Nejdl et al. \cite{nejdl03accesscontrol} have proposed very
recently first practical solution to access
-control problem in Peer-to-Peer systems. They use RDF-based schema policies to
restrict access to certain
-data. Unfortunately, their current prototype works only in loosely structured
systems.
+control problem in Peer-to-Peer systems. They use Resource Description
Framework (RDF) \cite{w3rdfurl} based
+schema policies to restrict access to certain data. Unfortunately, their
current prototype works only in
+loosely structured systems.
\subsection{Hostile entities}
@@ -1036,21 +1041,22 @@
Of course centralized authorities could be used for assignment of peer
identifiers, but they have
property of single point of failure. Furthermore, distributed peer
identification assignment can
be problematic as long as Sybil attack \cite{douceur02sybil} remains unsolved.
However, there are some partial solutions
-for controlling the rate at which and hostile entity is able to obtain peer
identifier, such as crypto-based
+for controlling the rate at which hostile entity is able to obtain peer
identifier, such as crypto-based
puzzles \cite{juels99clientpuzzles}.
-In the end, none of previously mentioned solutions are able to identify
hostile entities in practical,
+In the end, none of the previously mentioned solutions are able to identify
hostile entities in practical,
efficient way. More research is required to solve these problems.
\subsection{Secure query routing}
Much work has been done on secure routing, especially related to tightly
structured systems. In
-\cite{castro02securitystructured} and \cite{castro02securerouting}, authors
suggests the usage
+\cite{castro02securitystructured} and \cite{castro02securerouting}, authors
suggest the usage
of constrained routing tables and diverse routes, and detection of faults
during query routing.
-Additionally, authors present a important aspect of tightly structured
approach with regard
+Additionally, authors present an important aspect of tightly structured
approach with regard
to fault-tolerant query routing: the probability of routing successfully
between to arbitrary,
-correct peers, when a fraction $f$ of the other peers are faulty or hostile,
is only $(1-f)^{h-1}$.
+correct peers, when a fraction $f$ of the other peers are faulty or hostile,
is only $(1-f)^{h-1}$, where
+$h$ is the number of hops in the overlay.
Sit and Morris \cite{sit02securitycons} discuss the possibility of allowing
query originator
to observe lookup progress and cross-check routing tables using random
queries. However, their
@@ -1058,7 +1064,7 @@
in function.
Additionally, Lynch et al. \cite{lynch02atomicdataaccess} propose a solution
to secure routing table
-maintenance, but their solution seems to have to major problems
\cite{castro02securitystructured}. First,
+maintenance, but their solution seems to have two major problems
\cite{castro02securitystructured}. First,
the solution is very expensive even without faulty or hostile entities.
Second, each group of replicas
in their solution must have less than 1/3 of its peer faulty. Thus, this
feature results in a low
probability of successful routing.
@@ -1070,7 +1076,7 @@
Fiat et al. in \cite{fiat02censorship},
\cite{saia02dynamicfaultcontentnetwork} and Datar in \cite{datar02butterflies}
describe tightly structured overlay with analytical results in the presence of
hostile entities. However,
-none of these proposals doesn't address an efficient, dynamic tightly
structured overlay and multiple rounds
+none of these proposals address an efficient, dynamic tightly structured
overlay and multiple rounds
of hostile attack. Also, above mentioned proposals are not very efficient. In
\cite{fiat02censorship}, each node
must maintain information of $O(\log^3{n})$ other peers, and in
\cite{datar02butterflies}, $O(\log^2{n})$ is required.
@@ -1081,7 +1087,7 @@
\subsection{Other security threats}
-Ross Lee graham lists several external threats against Peer-to-Peer networks
\cite{grahamp2psecurity}. Most important,
+Ross Lee Graham lists several external threats against Peer-to-Peer networks
\cite{grahamp2psecurity}. Most important,
the list includes viruses and Trojan. Currently, there are not even partial
solutions
to the problems mentioned above. General robustness properties of Peer-to-Peer
system is able to
deal with software failures and hostile attack, but fault tolerance against
external threats is unknown.
@@ -1113,10 +1119,10 @@
may not be fast when desired data item requires many consecutive flooding
rounds.
Directed BFS \cite{yang02improvingsearch} optimizes the original
-BFS in way that peer selects neighbors with many quality results
-may be reached, thereby maintaining the the quality of costs and decreasing
the amount
+BFS in a way that peer selects neighbors with many quality results are reached
earlier,
+thereby maintaining the quality of costs and decreasing the amount
of messages sent to network. Alpine \cite{alpineurl} and NeuroGrid
\cite{joseph02neurogrid}
-are Peer-to-Peer system which use somewhat similar method when performing data
lookups.
+are Peer-to-Peer systems which use somewhat similar method when performing
data lookups.
Local indices \cite{yang02improvingsearch} is one variation of active caching.
In this scheme, each peer maintains an index over the data of all nodes within
@@ -1129,14 +1135,14 @@
randomly selected neighbor. The basic random walk approach decreases the
overhead generated by messages. On the other hand, basic random walk approach
has poor response time. As suggested in \cite{lv02searchreplication},
-random walk approach can be done more effective by introducing
-multiple ''walkers''. Freenet \cite{clarke00freenet} Peer-to-Peer system uses
+random walk approach can be made more effective by introducing
+multiple ''walkers''. Freenet \cite{clarke00freenet} uses
random walk searches in query lookups. Indeed, Freenet's query resembles
Depth-First-Search (DFS) and peers' routing tables are dynamically built
using caching. This is an outcome of Freenet's main design principles,
i.e., anonymity. Another property of Freenet's data lookup model is that
it adapts well with varying usage patterns. Improvements to Freenet's data
lookup using
-''small-world phenomenon'' has been proposed by Zhang et al..
\cite{zhang02using}.
+''small-world phenomenon'' has been proposed by Zhang et al.
\cite{zhang02using}.
Since tightly structured systems have efficient data lookup at the application
level overlay,
@@ -1161,21 +1167,21 @@
To make Peer-to-Peer systems even more popular (and usable), these systems
have to support flexible, efficient
and easy to use search methods. For instance, Internet's perhaps the most
important feature
is the ability to perform keyword or fuzzy searches (e.g., Google). Currently,
only loosely
-structured systems are able carry out this requirement. Unfortunately, as
discussed in this text,
+structured systems are able to carry out this requirement. Unfortunately, as
discussed in this text,
the data lookup model of loosely structured approach is not scalable. Thus,
research efforts have
been focused on tightly structured approach.
The main problem with tightly structured approach is the fact that tightly
structured algorithms
-performs data lookups based on a globally unique identifier (key). Quite
recent study has been focused
+perform data lookups based on a globally unique identifier (key). Quite recent
study has been focused
on the feasibility of Peer-to-Peer Web-like indexing and searching
\cite{li03feasibility} on top of
tightly structured overlays. Authors argue, that it is possible to implement
Peer-to-Peer Web-like search with certain radical compromises.
-First, Peer-to-Peer search engine may need to decrease result quality in order
make searching more
+First, Peer-to-Peer search engine may need to decrease result quality in order
to make searching more
efficient. Second, Peer-to-Peer systems must observe better the properties of
underlying network for
better performance.
Some studies have been concentrated on SQL-like queries \cite{harren02complex}
-in tightly structured overlays. Another approaches include adaption of data
lookup model of loosely
+in tightly structured overlays. Other approaches include adaption of data
lookup model of loosely
structured approach into tightly structured systems
\cite{ansaryefficientbroadcast03}, \cite{chord:om_p-meng}.
-Additional studies include additional layer upon overlay network
\cite{kronfol02fasdsearch},
+Some studies include additional layer upon overlay network
\cite{kronfol02fasdsearch},
\cite{joseph02p2players} and range queries \cite{andrzejak02rangequeries}.
Many techniques have been developed in order to provide more efficient search
indexing. As
@@ -1203,27 +1209,27 @@
structured system, peers join and leave the system constantly without any
restrictions. On the
other hand, however, peers in tightly structured system have less freedom,
i.e. overlay chooses peer's
neighbors on behalf of peer itself and maps data items randomly throughout the
overlay network. However,
-Peer-to-Peer system is \emph{never} in ''ideal'' state as it is always
evolving system.
+Peer-to-Peer system is \emph{never} in ''ideal'' state as it is continuouslys
evolving system.
-Current research has been focused on system management of tightly structured
systems, since all presented
+Current research has been focused on system management of tightly structured
systems, and all presented
algorithms of tightly structured approach have been analyzed under static
simulation environments. Furthermore, proposed
tightly structured overlays are configured statically to achieve the desired
reliability even in uncommon and adverse environment
\cite{rowston03controlloingreliability}. The most important factor for
-future research is to get real-life experiences from tightly structured
system, when there are frequent
+future research is to get real-life experiences from tightly structured
systems, when there are frequent
joins and leaves in the system. Some research has been done already in this
area.
A concept of ''half-life'' was introduced by Liben-Nowell
\cite{libennowell01observations}. Half-life is defined
as follows: let there be $N$ live nodes at time $t$. The doubling from time
$t$ is the time that pass before
$N$ new additional nodes arrive into the system. The halving time from time
$t$ is the time
-requires for half of the living nodes at time $t$ to leave the system. The
half-life from
+required for half of the living nodes at time $t$ to leave the system. The
half-life from
time $t$ is smaller of the properties stated above. The half-life of the
entire system is the
-minimum half-life over all times $t$. Concept of half-time can be used as
basic for developing
-more efficient analytical tools for modeling complex Peer-to-Peer system.
+minimum half-life over all times $t$. Concept of half-time can be used as a
basis for developing
+more efficient analytical tools for modeling complex Peer-to-Peer systems.
Some research has been done with regard to load balancing properties of
tightly structured
overlays. Byers et al. suggest "power of two choices" whereby data item is
stored at the less loaded
of two (or more) random peer alternatives \cite{byers03dhtbalancing}. Rao et
al. uses virtual servers
-to control load balance in Peer-to-Peer systems \cite{rao03loadbalancing}.
Their work rests on
+to control load balance in Peer-to-Peer systems \cite{rao03loadbalancing}.
Their work rests on the
idea which was originally introduced by Chord \cite{stoica01chord} system.
Also, query and routing hot spots may be an issue in tightly structured
overlays \cite{ratnasamy02routing}.
@@ -1234,24 +1240,24 @@
As mentioned before, an implicit assumption of almost every tightly structured
system is that there is random, uniform
distribution of peer and key identifiers. Even if participating peers are
extremely heterogeneous, e.g., in
-face of computing power or network bandwidth, all data items are distributed
uniformly. Clearly, this
+face of computing power or network bandwidth, all data items are distributed
uniformly. Clearly, this is
a serious problem of tightly structured overlays in face of performance and
load balancing. Measurement study
by Saroiu et al. shows that there is extreme heterogeneity among participating
peers in already deployed Peer-to-Peer
-systems \cite{saroiu02measurementstudyp2p}. Symphony seems to be the first
tightly structured overlay system
-which support heterogeneity. Zhao et al. have proposed a secondary layer a top
of structured overlay
+systems \cite{saroiu02measurementstudyp2p}. Symphony \cite{gurmeet03symphony}
seems to be the first tightly structured overlay system
+which supports heterogeneity. Zhao et al. have proposed a secondary layer on
top of structured overlay
to support heterogeneity better \cite{zhao02brocade}.
Research has been done on self-organization. Ledlie et al. propose techniques
for forming and maintaining
groups in highly dynamic environment \cite{ledlie02selfp2p}. Unfortunately
their work relies on idea that
-participating peers would create multiple hierarchical groups; it's not clear
whether this approach
+participating peers would create multiple hierarchical groups. It's not clear
whether this approach
is fault-tolerant and suitable to Peer-to-Peer environment. More promising
work has been done by Rowston et al.
in \cite{rowston03controlloingreliability}. Authors propose techniques for
self-tuning, dealing with
uncommon conditions (e.g., network partition and high failure rates).
Moreover, authors argue that
-with these techniques, the concerns over the tightly structured overlay
maintenance costs are no more
+with these techniques, the concerns over tightly structured overlay
maintenance costs are no more
an open issue.
Finally, little research has been done regarding self-monitoring and data
availability. Zhang et al.
-describe a arbitrary data structure on top of tightly structured overlay
\cite{zhang03somo}. They
+describe an arbitrary data structure on top of tightly structured overlay
\cite{zhang03somo}. They
call their proposal as \emph{data overlay}, since it supports several
fundamental data structures.
Authors use this data overlay to build Self-Organized Meta data Overlay
(SOMO), which can be used
for monitoring health of tightly structured overlay. Fault tolerance of SOMO
itself is currently
@@ -1264,7 +1270,7 @@
\subsection{Programming guidelines and benchmarks}
-All existing Peer-to-Peer systems have rather different interfaces even they
have common points and
+All existing Peer-to-Peer systems have rather different interfaces even though
they have common properties and
components. More important, all existing Peer-to-Peer systems are incompatible
with each other. One
of the most important area of future research is to create common programming
abstractions, i.e.,
interfaces, design patters and frameworks. Also, equal benchmarks are needed
for comparing
@@ -1275,13 +1281,13 @@
\subsection{Social behavior}
Frequent assumption in Peer-to-Peer systems is that peers are willing to
cooperate. Another belief
-is that all peers would behave equally, i.e., all peers both consume resources
and contributes resources.
-However, these assumptions are not true as several studies show peers rather
consume than contribute and
-and peers are unwilling to cooperate \cite{saroiu02measurementstudyp2p},
\cite{oram01harnessingpower},
+is that all peers would behave equally, i.e., all peers both consume and
contribute resources.
+However, these assumptions are not true as several studies show. Peers rather
consume than contribute and
+peers are unwilling to cooperate \cite{saroiu02measurementstudyp2p},
\cite{oram01harnessingpower},
\cite{hearn02mojonation}.
-Somewhat surprisingly little research has been in this area, especially when
considering
-the possible impact of this \emph{unwanted social behavior} to performance of
Peer-to-Peer
+Somewhat surprisingly little research has been done in this area, especially
when considering
+the possible impact of \emph{unwanted social behavior} to performance of
Peer-to-Peer
system. Problem is addressed by Golle et al. \cite{golle01incentivesp2p}. Some
research has been focused on semantic properties of the overlay in order to
increase
cooperation among participating peers \cite{crespo02semanticoverlay}.
Ramanathan et al.
@@ -1291,31 +1297,31 @@
Peer-to-Peer system, which uses empirical metrics for peer selection.
-\subsection{Simulating the Peer-to-Peer system}
+\subsection{Simulating Peer-to-Peer systems}
-Very little research has been done on simulating the \emph{global}
Peer-to-Peer system. Presumably, this
+Very little research has been done on simulating a \emph{global} Peer-to-Peer
system. Presumably, this
is due to complex nature of Peer-to-Peer system, which makes comprehensive
simulations very
difficult. Floyd et al. has been studying the simulation of the Internet in
\cite{504642}. Authors
state that simulating the Internet is very challenging task, because of
Internet's heterogeneity
-and rapid change. Obviously, these factors exist also in Peer-to-Peer system
even with higher
+and rapid change. Obviously, these factors exist also in Peer-to-Peer systems
even with higher
rates.
-As long as global simulations of Peer-to-Peer systems are lacking, we cannot
we make any further
-analysis e.g., on usage patterns in Peer-to-Peer systems. However, we can
assume that
-, e.g., query keywords follow the Zipf-like distributions
\cite{breslau98implications} both in the
+As long as global simulations of Peer-to-Peer systems are lacking, we cannot
make any detailed
+analysis on usage patterns in Peer-to-Peer systems. However, we can assume
that, e.g.,
+query keywords follow the Zipf-like distributions \cite{breslau98implications}
both in the
Internet and in Peer-to-Peer systems.
\section{Summary}
-In this section we summarize open problems in Peer-to-Peer systems. All open
problems entries
+In this section we summarize open problems in Peer-to-Peer systems. All open
problem entries
listed in this section are not necessarily mentioned in the previous sections.
Problems listed
here are variations of previously mentioned problems, or otherwise related to
them. For each entry,
-there is brief description of the problem, possible solutions and comments
respectively.
+there is a brief description of the problem, possible solutions and comments
respectively.
Next, we list open problems in Peer-to-Peer domain. In table
\ref{table_security_problems_Peer-to-Peer}
-we list open problems related to security; in table
\ref{table_performanceusability_problems_Peer-to-Peer},
-we list problems related to performance; in table
\ref{table_Miscellaneous_problems_Peer-to-Peer},
-we list miscellaneous open problems.
+open problems related to security are listed; in table
\ref{table_performanceusability_problems_Peer-to-Peer},
+problems related to performance are listed; in table
\ref{table_Miscellaneous_problems_Peer-to-Peer},
+miscellaneous open problems are listed.
\scriptsize
@@ -1620,7 +1626,7 @@
\parbox{90pt}{Comprehensive simulations/analysis of Peer-to-Peer network} &
\parbox{110pt}{Ability to simulate whole Peer-to-Peer network's usage
patterns, network traffics, flux state etc} &
\parbox{110pt}{Use same techniques as simulating/analyzing the Internet} &
-\parbox{110pt}{Only small subset of Peer-to-Peer networks has been able to
22, because of ad hoc properties of network, more powerful solutions needed}
+\parbox{110pt}{Only small subsets of Peer-to-Peer networks has been analysed,
because of ad hoc properties of network, more powerful solutions needed}
\\ \hline
Index: gzz/Documentation/misc/hemppah-progradu/progradu.bib
diff -u gzz/Documentation/misc/hemppah-progradu/progradu.bib:1.103
gzz/Documentation/misc/hemppah-progradu/progradu.bib:1.104
--- gzz/Documentation/misc/hemppah-progradu/progradu.bib:1.103 Fri Mar 7
08:23:37 2003
+++ gzz/Documentation/misc/hemppah-progradu/progradu.bib Wed Mar 12
07:57:42 2003
@@ -2124,6 +2124,20 @@
NUMBER = 2396
}
address@hidden,
+ author = {Rüdiger Schollmeier},
+ title = {A definition of Peer-to-Peer Networking for the Classification
of Peer-to-Peer Architectures and Applications},
+ booktitle = {Proceedings of the First International Conference on
Peer-to-Peer Computing {(P2P'01)}},
+ year = {2001}
+}
-
address@hidden,
+ author = {S. R. Waterhouse and D. M. Doolin and G. Kan and Y.
Faybishenko},
+ title = {Distributed Search in Peer-to-Peer Networks},
+ journal = {IEEE Internet Computing},
+ volume = {6},
+ pages = {68--73},
+ year = {2002},
+ url = {http://www.jxta.org/project/www/docs/jsearch.pdf}
+}
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., (continued)
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/06
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/06
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/06
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/06
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/06
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/07
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/07
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/07
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/12
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/12
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert...,
Hermanni Hyytiälä <=
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/12
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/12
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/13
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/13
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/13
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/13
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/13
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/13
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/13
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/13