[Gzz] *Sigh.* Fwd: ACM HT03: Paper Results
From: b . fallenstein
Subject: [Gzz] *Sigh.* Fwd: ACM HT03: Paper Results
Date: Mon, 7 Apr 2003 18:17:44 +0200 (MEST)
--- Forwarded Message ---
Date: Mon, 7 Apr 2003 16:12:21 +0100
From: Les Carr <address@hidden>
To: address@hidden
Subject: ACM HT03: Paper Results
>
> I am sorry to inform you that your submission
> Storm: Supporting data mobility through location-independent identifiers
> has not been chosen as a full paper for the ACM Hypertext 2003 conference.
> The procedures have been very thorough, and only 25% of the submissions
> were selected as full papers. Most papers were reviewed by 5 referees whose
> comments are attached below.
>
> If you look at the Web site (www.ht03.org) you will see there are many
> categories of dissemination still available. You may wish to submit a version
> of your work in the form of a two-page Short paper, a four-page Technical
> Briefing, a Poster or a Demonstration (deadline for all categories 30th
> May).
>
> The Program Committee is keen to see your work represented at the
> conference in the following context:
>
> *** The committee particularly felt that the results that you
> describe are of sufficient scientific value to warrant exposure
> as a Short Paper. If you would be prepared to submit a version of
> your work to appear as a Short paper, please email the Paper Chair
> address@hidden
>
>
>
> Thank you for your interest in HT03: as you can see from the web site
> there are many opportunities still available. We are anticipating a lively
> conference and would value your participation.
> ---
> Les Carr & Lynda Hardman
> HT03 PC CoChairs
>
> ===================== REVIEWS ====================
> REVIEWER SCORE: 4
> ***Comments to Authors
>
>
>
> ***Summary Comments
> Describes a storage system that uses hash codes
> for location-independent identification of data blocks.
> The current implementation supports just one
> storage system, no distribution!
>
>
> --------------------------------------------------------
> REVIEWER SCORE: 8
> ***Comments to Authors
> I rather like the paper. It describes good research with good motivations,
> a good grasp of the issues involved, and more than adequate familiarity
> with the literature.
>
> My only possible objection is that the implementation of the described
> system is still, I understand, at a fairly rudimental stage, and many as yet
> unforeseeable issues may arise that may prevent your ideas from reaching full
> implementation. I suggest the authors update the paper with the latest
> results of their research, and be ready, during the conference, to provide
> additional info.
>
>
>
>
> ***Summary Comments
> Definitely accept
>
>
> --------------------------------------------------------
> REVIEWER SCORE: 3
> ***Comments to Authors
> This paper discusses the estimable aim of creating location-independent
> identifiers for all addressable objects in a way that ensures they do not
> break.
>
> The originality of the work lies mostly in the combination of existing
> technologies to this problem. The components of the solution, global
> identifiers, append-and-delete-only storage, peer-to-peer networking, are all
> well-established.
>
> However, there is a serious oversight which makes the scheme unreliable in
> its current form. This is the use of a cryptographic hash to generate a
> unique ID. Cryptographic hashes are a means for creating what is often
> called a "fingerprint" for a file, but unlike human fingerprints, hashes are not
> unique to documents - perhaps this "fingerprint" name is what has misled
> the authors of this paper. Hashes lose information, so what happens is that
> there are numerous (potential or actual) documents all possessing the same
> hash. With a cryptographically robust hash, we find that a small change in
> the document results in a radical change in the hash (the authors note
> this) but also that for a given hash, it is computationally infeasible to
> discover another document with the same hash, although they do exist (up to a
> certain size of hash, they must exist, just from the pigeon hole principle).
> So, there are likely to be pairs of documents with the same hash, and it
> will be difficult to predict which ones will have the same hash until the
> "collision" actually occurs.
>
> While the duplication of IDs will have a relatively low probability for
> any given file, it will have a non-trivial probability of happening for a
> number of files in a collection as large as the set of all files addressable
> on the Internet.
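
For scale, the probability discussed in the paragraph above can be estimated
with the standard birthday approximation. The sketch below is illustrative
only: it assumes hash values are uniformly distributed over 2^160 outputs (an
idealization of the 160-bit SHA-1 identifiers mentioned elsewhere in these
reviews), the function name is hypothetical, and the collection sizes are
made-up round numbers rather than figures from the paper or the review. It
covers accidental collisions only, not deliberately constructed ones.

```python
import math


def collision_probability(num_items: int, hash_bits: int) -> float:
    """Birthday approximation: p ~= 1 - exp(-n(n-1) / 2^(bits+1)).

    Assumes every item hashes to an independent, uniformly random value.
    """
    exponent = -num_items * (num_items - 1) / 2.0 ** (hash_bits + 1)
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(exponent)


if __name__ == "__main__":
    # Hypothetical collection sizes, chosen only to show the scaling.
    for n in (10**9, 10**12, 10**15):
        print(f"{n:>20d} items, 160-bit hash: p ~= {collision_probability(n, 160):.3e}")
```

Under these assumptions, while the estimate is small it grows roughly with the
square of the number of items and halves with each additional hash bit.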
>
> It is an excellent idea, to create unique global file IDs from the
> document's content, especially for this purpose, however cryptographic hashes are
> the wrong tool. I suspect that it may not be possible, without imposing
> some upper limit either on file size or on the number of files. The smallest
> absolutely unique document identifier you can reliably generate from the
> document content is essentially going to be the same size as the best possible
> compression for that document - any smaller and you lose information and
> cannot be certain which of a number of possible files is the one to which
> the ID belongs.
>
>
> ***Summary Comments
>
>
>
> --------------------------------------------------------
> REVIEWER SCORE: 2
> ***Comments to Authors
> This research and partial implementation report attempts to provide
> solutions to the problem of dangling links and multiple concurrent versions of
> resources in an environment where resources can move around the system and
> still be navigated, using xanalogical storage techniques. However, it has one
> critical flaw and fails to clearly address the issues that it raises.
>
> The Introduction states that the system is designed explicitly to use
> 'emerging' peer-to-peer data lookup technologies. Arguments aside about
> peer-to-peer being nothing new (and therefore not 'emerging'), the paper later
> states that no peer-to-peer implementation exists, and the implication is that
> it is unlikely to for some time as "[m]any practical problems have to be
> overcome". Given this state of affairs, this paper should have been written
> as a discussion paper detailing the issues of mobility more clearly,
> relating a proposed design against existing partial implementations rather than
> an undefendable implementation report.
>
> Whilst the two issues of focus (Dangling Links and Multiple Concurrent
> Versions) are interesting, other pressing issues regarding the impact of such
> data mobility, including provenance, rights management and accountability
> are ignored.
>
> With regard to the Storm Block Storage section, the choice of language and
> use of unexplained side-comments are frustrating. Storm proposes the
> xanalogical approach of 'chunking' content into immutable blocks that are
> identified by one-way hashes of their content. However, this introduces other
> pressing user problems, including how one finds a document when its identifier
> is a 160-bit hexadecimal string, and how policy models can be extended to
> cover multiple layers of content (blocks, updated - and thus different -
> blocks, documents, etc.)
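
The idea of immutable blocks identified by one-way hashes of their content
can be sketched as follows. This is a minimal illustration, not Storm's
actual API: the class and method names are hypothetical, the store is a
plain in-memory dictionary, and SHA-1 over the raw block bytes is assumed
because the reviews mention SHA-1-based, 160-bit identifiers.

```python
import hashlib


class BlockStore:
    """Hypothetical in-memory store of immutable blocks keyed by content hash."""

    def __init__(self) -> None:
        self._blocks: dict[str, bytes] = {}

    def add(self, data: bytes) -> str:
        """Store a block and return its identifier (40 hex digits = 160 bits)."""
        block_id = hashlib.sha1(data).hexdigest()
        # Same content always yields the same identifier, so re-adding is a no-op.
        self._blocks.setdefault(block_id, data)
        return block_id

    def get(self, block_id: str) -> bytes:
        """Return a block, verifying that its content still matches its identifier."""
        data = self._blocks[block_id]
        if hashlib.sha1(data).hexdigest() != block_id:
            raise ValueError("block content does not match its identifier")
        return data


if __name__ == "__main__":
    store = BlockStore()
    block_id = store.add(b"an immutable block of content")
    print(block_id, store.get(block_id))
```

Because the identifier depends only on the bytes, any copy of a block can be
verified against its identifier regardless of where it was fetched from;
changing the content produces a different block with a different identifier.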
>
> It is also unclear how documents are named, distributed, and thus
> discovered and navigated, if a document is a virtual file of references to spans
> (blocks and offsets and lengths therein). Is there another DHT for document
> identifiers sat atop the DHT that serves to 'locate' blocks on a peer
> network? If so, how are those identifiers generated?
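
The notion of a document as a virtual file of references to spans (a block,
an offset and a length) can be illustrated with a short sketch. All names
here are hypothetical and the block table is a plain dictionary keyed by an
assumed SHA-1 content hash; the sketch does not address the question raised
above of how such documents are themselves named or located.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """Reference to `length` bytes starting at `offset` inside one block."""
    block_id: str
    offset: int
    length: int


def put_block(blocks: dict[str, bytes], data: bytes) -> str:
    """Store a block under the hex SHA-1 of its content and return that id."""
    block_id = hashlib.sha1(data).hexdigest()
    blocks[block_id] = data
    return block_id


def resolve(spans: list[Span], blocks: dict[str, bytes]) -> bytes:
    """Assemble a virtual file by concatenating the referenced byte ranges."""
    return b"".join(
        blocks[s.block_id][s.offset:s.offset + s.length] for s in spans
    )


if __name__ == "__main__":
    blocks: dict[str, bytes] = {}
    a = put_block(blocks, b"hello world")
    b = put_block(blocks, b"goodbye moon")
    document = [Span(a, 0, 6), Span(b, 8, 4)]
    print(resolve(document, blocks))
```

In this sketch two documents that include the same span refer to the same
block identifier, which is what makes shared content detectable.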
>
> Both the section describing Implementation and Application-specific
> reverse indexing would benefit from examples of API use, and descriptions of what
> the operations identified actually mean, perhaps with relation made to
> existing distributed information systems.
>
> The section pertaining to dangling links in the Discussion section is
> questionable given the previous statement that no distributed (whether
> client-server or peer-to-peer) implementation of Storm exists. The conclusion that
> "Storm is no limited to network publishing" seems a little premature given
> that network publishing is one activity that Storm cannot currently
> achieve.
>
>
> ***Summary Comments
>
>
>
> --------------------------------------------------------
> REVIEWER SCORE: 7
> ***Comments to Authors
> - Rerun a spellchecker - spelling errors I found
> Last paragraph Sec 2.1: s/usesing/using/
> 2nd paragraph, Sec 7: s/bacause/because/
>
> Section 2.1
>
> - Add heading for related work on alternative versions
> - Not clear how Groove supports alternative versions
> in your description
>
> Section 2.2
>
> - The description of DHT in the "related work"
> section is a bit too detailed - if this is
> important for your work, it should be described
> elsewhere (I don't think it is as relevant right
> now, given that you have not yet finished the P2P
> implementation)
>
> Section 3
>
> - You should explain Figure 2 a bit more, even though you
> submitted another paper to HT'03 explaining it -
> this paper should be "stand-alone".
>
> Section 4
>
> - When you talk about the "persistency commitement",
> you might want to cite Tim Berners-Lee's write-up -
> I think this idea originates from him (although he
> would like to see it applied to URIs)
>
> Section 4.1
>
> - Para 5: you mention SHA-1-based identifiers, but
> I think you have not explained what they are anywhere
> in the text
>
> - Para 6: rather than talking about "network
> address translation", use the term NAT, which is
> more familiar to people
>
> Section 5
>
> - It is not clear which applications put things into
> the index - are these editors ?
>
> - Reading Section 5, I am unclear on how you find
> links - you talk about transclusions, but not about
> links
>
> Section 6.2
>
> - Heading needs a line-break
>
> Section 8
>
> - you talk about integrating STORM into other systems.
> It would be interesting to state whether it can
> be integrated into today's Web, and if yes, how.
>
>
>
>
> ***Summary Comments
> The paper describes a hypertext system based
> on location-independent identifiers, and convincingly
> argues that it could be able to address some of the
> issues of today's Web by making use of a combination
> of location-independent identifiers and P2P technology.
> Unfortunately, the P2P solution has not yet been
> implemented. Nevertheless, I think this is a very
> interesting paper, and should be accepted.
>
> After discussion within the PC, I feel that the authors
> should explain why they believe that an SHA-1 hash can
> be used as a global unique identifier, and how they think
> potential collisions should be handled.
>
>
> --------------------------------------------------------
> ***Comments to Authors
> Your paper caused a fair amount of discussion among the reviewers, which is
> a fine quality of a paper. While many found your approach interesting, it
> was however the consensus that your work is still immature - i.e. you have
> devised your block storage system and a naming scheme, but other crucial
> parts of a functioning system are still missing. As you can see, some
> reviewers found the foundation (cryptographic hashes) for your naming scheme
> to be highly problematic, and as this is very central to your system, you
> might want to examine this issue in depth. There is however no doubt that
> you are doing interesting and significant work, but not yet finished enough
> to warrant a full paper. You are encouraged to present a short paper,
> focusing on your central contributions.
>
>
>
> ***Summary Comments
>
>
>
> --------------------------------------------------------
>