[Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert...
From: Hermanni Hyytiälä
Subject: [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert...
Date: Thu, 09 Oct 2003 05:49:01 -0400
CVSROOT: /cvsroot/gzz
Module name: gzz
Branch:
Changes by: Hermanni Hyytiälä <address@hidden> 03/10/09 05:49:01
Modified files:
Documentation/misc/hemppah-progradu: masterthesis.tex
Log message:
Barbara's comment #4
CVSWeb URLs:
http://savannah.gnu.org/cgi-bin/viewcvs/gzz/gzz/Documentation/misc/hemppah-progradu/masterthesis.tex.diff?tr1=1.210&tr2=1.211&r1=text&r2=text
Patches:
Index: gzz/Documentation/misc/hemppah-progradu/masterthesis.tex
diff -u gzz/Documentation/misc/hemppah-progradu/masterthesis.tex:1.210
gzz/Documentation/misc/hemppah-progradu/masterthesis.tex:1.211
--- gzz/Documentation/misc/hemppah-progradu/masterthesis.tex:1.210 Wed Oct
8 09:24:48 2003
+++ gzz/Documentation/misc/hemppah-progradu/masterthesis.tex Thu Oct 9
05:49:00 2003
@@ -1731,61 +1731,59 @@
\chapter{Evaluation of Peer-to-Peer for Fenfire}
-In this chapter we evaluate Fenfire in Peer-to-Peer environment.
+In this chapter we evaluate Fenfire in the Peer-to-Peer environment.
We start by giving a problem overview. Then, we define special needs and
evaluate existing
Peer-to-Peer approaches in light of these requirements. After that, we propose
a combination
of Peer-to-Peer techniques reviewed in this thesis to be used with Fenfire and
present simple methods to perform data
-lookups. Data lookups are required by Alph module which implements the
xanalogical storage model in Fenfire.
+lookups. Data lookups are required by an Alph module, which implements the
xanalogical storage model in Fenfire.
In the end of this chapter, we discuss possible problems of using Fenfire
in Peer-to-Peer environment.
\section{Problem overview}
-Some research regarding to Peer-to-Peer technologies and hypermedia systems
have been made by Lukka et al.
-\cite{lukka02freenetguids}. Authors' work is mainly based on the insight of
implementing
+Several research studies regarding Peer-to-Peer technologies and hypermedia
systems have been made by Lukka et al.
+\cite{lukka02freenetguids}. The authors' work is mainly based on the insight
of implementing
the xanalogical storage model in Peer-to-Peer environment with globally unique
identifiers. Lukka et al.
-use Freenet \cite{clarke00freenet} as an example Peer-to-Peer system
supporting
+use Freenet \cite{clarke00freenet} as a sample Peer-to-Peer system supporting
globally unique identifiers. The work presented in this thesis extends their
work by
evaluating different Peer-to-Peer systems more extensively to Fenfire's needs.
Additionally, related to non-xanalogical hypermedia systems, Bouvin
\cite{bouvin02openhypermedia} has done initial work regarding ways in which
Peer-to-Peer can be used in non-xanalogical hypermedia systems. Thompson and
de Roure
-\cite{thompson01hypermedia} have studied locating documents and links in
Peer-to-Peer
-environment. At the Hypertext '02 panel, moderated by Wiil
\cite{wiil02p2phypertext},
-participants responded whether Peer-to-Peer systems are suitable for hypermedia
-publishing or not.
-
-In Peer-to-Peer environment, our objectives are simple but yet hard to fulfill.
-First, as discussed in chapter 4, xanalogical document is a ''virtual
-file'', in which parts of the document are fetched from a
-\emph{global} data repository\footnote{Global repository is not a requirement.
Locally constructed xanalogical
-documents are feasible and they can be assembled without global data
repository.}. System implementing the xanalogical storage model \emph{must}
-support global scale data lookups, i.e., if a data item exists in the system
it can be located and fetched.
-Specifically, our task is to locate and fetch (i.e., obtain) all Storm blocks,
-associated to a specific ''virtual file'' from the Peer-to-Peer
network\footnote{We call''associated'' blocks as \emph{scroll} blocks.}. Also,
in addition to the
-\emph{direct} block obtaining using globally unique identifier of Storm block,
-we also must support the \emph{indirect} obtaining of Storm block using the
pointer mechanism.
-Second, we want that users' operations in Fenfire
-are location transparent: data lookups have to be efficient, since
constructing
-one ''virtual file'' may need obtaining several Storm blocks, which are
distributed
-randomly throughout the overlay. If not efficient, construction of the
''virtual file''
-may take reasonable amount of time while rendering system very unusable.
Third, Peer-to-Peer
-infrastructure has to be scalable, fault tolerant against hostile attacks and
resilience in
+\cite{thompson01hypermedia} have studied locating documents and links in the
Peer-to-Peer
+environment. At a Hypertext '02 panel, moderated by Wiil
\cite{wiil02p2phypertext},
+participants debated whether or not Peer-to-Peer systems are suitable for
hypermedia
+publishing.
+
+In a Peer-to-Peer environment, our objectives are simple, yet hard to fulfill.
+First, as discussed in Chapter 4, a xanalogical document is a ''virtual
+file'' in which parts of the document are fetched from a
+global data repository\footnote{A global repository is not a requirement.
Locally constructed xanalogical
+documents are feasible and they can be assembled without a global data
repository.}. The system implementing the
+xanalogical storage model must support global scale data lookups, i.e., if a
data item exists in the system it
+can be located and fetched. Specifically, our task is to locate and fetch all
Storm blocks,
+associated to a specific virtual file from the Peer-to-Peer
network\footnote{We call ''associated'' blocks \emph{scroll} blocks.}.
+In addition to the direct block obtaining using the globally unique identifier
of the Storm block,
+we must also support the indirect obtaining of the Storm block using the
pointer mechanism.
+Second, we want users' operations in Fenfire
+to be location transparent: data lookups have to be efficient, since
constructing
+one virtual file may require obtaining several Storm blocks, which are
distributed
+randomly throughout the overlay. If not efficient, construction of the virtual
file
+may take an unreasonable amount of time, rendering the system unusable.
Third, the Peer-to-Peer
+infrastructure has to be scalable, fault tolerant against hostile attacks and
resilient in
adverse conditions (e.g., a network partition).
\section{Evaluation of Peer-to-Peer approaches with regard to Fenfire}
-In this section we focus on locating the Storm blocks in Peer-to-Peer
environment. We don't
-respond to fetching of Storm blocks as fetching can be performed easily once
-Storm block is located.
+In this section we focus on locating the Storm blocks in a Peer-to-Peer
environment.
-In chapter 2, we discussed the main differences between the loosely and the
tightly structured
+In Chapter 2, we discussed the main differences between the loosely and the
tightly structured
approach. As stated, the most significant difference is that the tightly
structured
approach has at least poly-logarithmical properties in all internal
operations, while the loosely
structured approach does not always have even linear properties. Furthermore,
the
-data lookup model of the tightly structured overlay scales much better than in
loosely
+data lookup model of the tightly structured overlay scales much better than
the loosely
structured overlays; the tightly structured overlay supports global data
lookups
in the overlay, whereas the data lookup model of the loosely structured
approach
is limited to a certain area of the overlay\footnote{The area depends on where
the query
@@ -1793,19 +1791,19 @@
is more efficient and scalable than the loosely structured approach.
Since both Storm and tightly structured overlays use globally unique
identifiers for each data item,
-it is feasible to use tightly structured overlays for \emph{locating} Storm
blocks efficiently.
+it is feasible to use tightly structured overlays for locating Storm blocks
efficiently.
Additionally, the unstructured and semantic-free properties of Storm
identifiers enable
the use of general purpose Reference Resolution Services (RRS)
\cite{balakrishnan03semanticfree} on
top of the tightly structured overlay.
Again, there are research challenges with tightly structured systems which
have to be
-addressed, as described in chapter 3. The main concerns include decreased
performance and fault
+addressed, as described in Chapter 3. The primary concerns include decreased
performance and fault
tolerance in presence of system flux, non-optimal distance functions in
identifier space,
proximity routing, hostile entities and flexible search
\cite{balakrishanarticle03lookupp2p}.
-Additionally, there is only little real world experiments with tightly
structured systems
-(e.g., \cite{overneturl, edonkey2kurl}). Therefore, we cannot say for sure,
how well these
-systems would perform in real Peer-to-Peer environment. However, we believe
that these issues will be
-solved in the near future, since there is a strong and wide research community
towards tightly structured
+Additionally, there are few real world experiments with tightly structured
systems
+(e.g., \cite{overneturl, edonkey2kurl}). Therefore, we cannot say explicitly,
how well these
+systems would perform in a real Peer-to-Peer environment. However, we believe
that these issues will be
+solved in the near future, since much current research is concentrating on
tightly structured
overlays \cite{projectirisurl}.
@@ -1817,54 +1815,54 @@
\subsection{System proposal}
-We emphasize that we prefer \emph{abstraction}
-level analysis as very recently better and better tightly structured
algorithms have been proposed.
+We emphasize that we prefer abstraction
+level analysis because greatly improved tightly structured system designs have
been proposed recently.
Thus, we don't want to bind our system proposal to a specific algorithm
definitively as we expect
-that this development continues.
+that ongoing development will continue.
Currently, we see Kademlia \cite{maymounkov02kademlia} as the best algorithm
for
locating data efficiently in the Peer-to-Peer overlay. There are two
reasons for this. First, Kademlia's XOR-based distance function is superior
-over distance functions of other systems (see section 2.3.2). Secondly,
Kademlia
-is one of the only tightly structured systems that has been deployed in real
life
+to the distance functions of other systems (see section 2.3.2). Secondly,
Kademlia
+is one of the few tightly structured systems that has been deployed in
practical applications
(e.g., \cite{overneturl, edonkey2kurl, kashmirurl,kato02gisp}), which means
that
Kademlia's algorithm is simple and easy to implement.
-On top of Kademlia, we propose the use of Sloppy hashing \cite{sloppy:iptps03}
which
-is optimized for the DOLR abstraction of tightly structured overlays. With
Sloppy hashing
-we can provide locality properties for the Fenfire system which may be useful
-within a small group of working people.
+In addition to Kademlia, we propose the use of sloppy hashing
\cite{sloppy:iptps03} which
+is optimized for the DOLR abstraction of tightly structured overlays. With
sloppy hashing
+we can provide locality properties for the Fenfire system with regard to the
routing
+algorithm.
For better fault tolerance and self-monitoring for Fenfire, we propose
techniques
presented by Rowstron et al. \cite{rowston03controlloingreliability}. With
their methods,
-we can ensure the performance of the Fenfire system in a highly adverse
conditions, such
-as sudden network partition, or highly dynamic and heterogeneous environment.
+we can ensure the performance of the Fenfire system within highly adverse
conditions, such
+as a sudden network partition, or a highly dynamic and heterogeneous
environment.
Additionally, for more efficient data transfer, we can use various techniques
for this purpose.
-For small amounts of data, HTTP can be used \cite{rfc2068}. For large amounts
of data, we can use
+For small amounts of data, HTTP protocol can be used \cite{rfc2068}. For large
amounts of data, we can use
multisource downloads for better efficiency and reliability. Specifically, the
technology based
on rateless erasure codes \cite{maymounkov03ratelesscodes} seems very
promising.
-Furthermore, multisource downloads can be used for decreasing load of a
certain peer, thus avoiding query
+Furthermore, multisource downloads can be used for decreasing the load of a
certain peer, thus avoiding query
hot spots in the system \cite{ratnasamy02routing}. Current client-server
implementation of Fenfire uses
standard single source downloads (HTTP) and SHA-1 \cite{fips-sha-1}
cryptographic content
hash for verifying the integrity of data. In face of multisource downloads,
Fenfire must support
tree-based hashes\footnote{With multisource downloads, tree-based hash
functions can be used
to verify fixed length segments of data. If hash value of data segment is
incorrect,
-we need only to fetch \emph{segment} of data (instead of whole data) from
-an other source.}, such as \cite{merkle87hashtree, mohr02thex} for reliable
and efficient
-data validation.
+we need only to fetch a \emph{segment} of data (instead of entire data) from
+another source.}, such as Merkle's \cite{merkle87hashtree} or Mohr's
\cite{mohr02thex} techniques
+for reliable and efficient data validation.
\subsection{Methods}
In our data lookup methods, we use the DOLR abstraction of the tightly
structured approach since DOLR systems locate data without
specifying a storage policy explicitly \cite{rhea03benchmarks}, i.e., each
participating peer hosts
-the data they are offering and the overlay maintains the \emph{pointers} to
the data.
+the data they are offering and the overlay maintains the pointers to the data.
Storage systems based on the DHT abstraction, such as CFS
\cite{dabek01widearea} and PAST \cite{rowstron01storage}, may have
-severe problems with load balancing in a highly heterogeneous environment
\cite{rao03loadbalancing}. The problem is caused by peers
-which may not be able to store relatively large blocks, assigned randomly by
the mapping function of the overlay.
+severe difficulties with load balancing in a highly heterogeneous environment
\cite{rao03loadbalancing}. The problem is caused by peers
+that may not be able to store relatively large blocks, assigned randomly by
the mapping function of the overlay.
-For simplicity, we assume that we have resolved the construction of the
''virtual file'' before locating any Storm blocks, i.e.,
-when assembling the ''virtual file'' we know all the Storm blocks, which are
required to complete the ''virtual file''.
+For simplicity, we assume that we have resolved the construction of the
virtual file before locating any Storm blocks:
+when assembling the virtual file, we know all of the Storm blocks that are
required to complete the virtual file.
Also, we don't respond to the security issues related to Peer-to-Peer systems,
since there is no working solution
available yet. We either assume that Fenfire has a reliable technique for
identifying individual entities, or
there are no hostile entities among participating peers, i.e., Storm blocks
can be identified correctly (e.g., when
@@ -1921,28 +1919,28 @@
Perhaps the biggest issue in Peer-to-Peer systems is the immaturity of
security technologies. For the Fenfire system, one security related problem
occurs when a user wants to
-perform a global data lookup with a given pointer; how the user is able to
verify
-the correctness of the search results, e.g., how she or he knows which one is
the
-correct Storm block ? Another problem related to the Fenfire's
+perform a global data lookup with a given pointer: how is the user able to
verify
+the correctness of the search results, e.g., how does she or he know which one
is the
+correct Storm block? Another problem related to the Fenfire's
security is that if a user downloads data from the network to local computer
-and after a network disconnection, user wants to verify \emph{off line} the
+and, after a network disconnection, the user wants to verify off-line the
authenticity of data. Finally, if a data lookup is performed by a user, but
there is no reply
from the Fenfire system, how are we able to know if this was a Spam attack
\cite{naor03simpledht},
-or the data really does not exist in the system ?
-These problems, however, are not only limited to the Fenfire system as it
+or the data really does not exist in the system?
+These problems, however, are not only limited to the Fenfire system but
concern all Peer-to-Peer computer systems.
-Optimal solutions to all security issues would be that digital
-signatures are included to every message which are sent to the system or the
use of working PKI-based
+Optimal solutions to the majority of the security issues would be that digital
+signatures are included in every message sent to the system or the use of
working PKI-based
certificate distribution. As security technologies become more mature, we wish
to apply these
technologies with Fenfire, if applicable.
\chapter{Conclusions and future work}
In this thesis, we have reviewed existing Peer-to-Peer approaches, algorithms
and
-their properties. Our perception is that despite the great amount of
Peer-to-Peer systems,
+their properties. Our perception is that despite the great number of
Peer-to-Peer systems,
we are able to classify \emph{all} systems either to loosely or tightly
structured systems.
-We have summarized open problems in Peer-to-Peer research domain.
Specifically, we divided open
+We have summarized open problems in the Peer-to-Peer research domain.
Specifically, we divided open
problems into the three sub-categories: security related problems,
performance related problems and miscellaneous problems.
@@ -1950,16 +1948,16 @@
described Storm software module.
In the last chapter, we evaluated existing Peer-to-Peer approaches with regard
-to Fenfire's needs. Currently, we see that the tightly structured approach as
the
-best alternative to Fenfire's needs for the following reasons.
+to Fenfire's needs. Currently, we view the tightly structured approach as the
+best choice for Fenfire's needs for three reasons.
First, both Storm and tightly structured overlays use globally unique
identifiers for each data item. Therefore,
-it is feasible to use tightly structured overlays for \emph{locating} Storm
blocks efficiently.
-Second, the unstructured and semantic-free properties of Storm identifiers
enables
-the use of general purpose Reference Resolution Services (RRS)
\cite{balakrishnan03semanticfree} on
-top of the tightly structured overlay. As the authors of
\cite{balakrishnan03semanticfree},
-we also agree that references should be semantically-free in next-generation
RRS systems.
-Third, we believe that issues related to tightly structured overlays will be
solved in
-the near future, because of wide and intensive co-operation among research
groups.
+it is feasible to use tightly structured overlays for locating Storm blocks
efficiently.
+Second, the unstructured and semantic-free properties of Storm identifiers
enable
+the use of general purpose Reference Resolution Services (RRS) on
+top of the tightly structured overlay. As Balakrishnan et al.
\cite{balakrishnan03semanticfree},
+we concur that references should be semantically-free in next-generation RRS
systems.
+Third, we believe current concerns related to tightly structured overlays will
be resolved in
+the near future, because of wide and intensive cooperation among research
groups.
Our future work includes a support for searching transclusions and xanalogical
links in Peer-to-Peer environment. Preliminary analysis have shown