[Gzz] Re: the Storm article
From: Alatalo Toni
Subject: [Gzz] Re: the Storm article
Date: Fri, 7 Mar 2003 08:16:44 +0200 (EET)
On Thu, 6 Mar 2003, Eric Armstrong wrote:
i quote extensively to inform the other authors and the actual designers
via the list -- must go teach some students ;) in 10mins but a few
reactions here first:
> First, I liked the article. A lot. I'm looking
> forward to using Storm, and so is Eugene (only
> took a 30 second presentation to get him
> interested).
glad to hear, and thank you very much for the detailed comments!
> I even started a review article "Taking the World
> by Storm", for publication in some broad-interest
> journal. (Not sure which one, though.)
i've also started working on a review, from a security point of view
related to mobile and ubiquitous computing environments (wireless
connections and small devices)
> Next, specifics.
can't go into most of this right now but probably later today, or perhaps
Benja, Tuomas or Hermanni before me (hence the full quote)
> Missing Ingredients
> -------------------
> These are things that need to be addressed in the
> article, however briefly, but are not currently
> mentioned:
> * How are collisions handled?
> (Surely some small blocks must produce the
> same cryptographic hash as other small blocks,
> sometimes.)
afaik it should not happen, but it is theoretically possible. so i guess
it's a good question :)
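to put rough numbers on "theoretically possible", here is a minimal sketch. it assumes block ids are the hex SHA-1 digest of the block's bytes (the actual Storm id format may differ) and that the hash behaves like a random function, so the usual birthday approximation applies. `block_id` and `collision_probability` are names made up for illustration, not Storm API.

```python
import hashlib
import math

HASH_BITS = 160  # assumption: a 160-bit hash such as SHA-1


def block_id(data: bytes) -> str:
    """Hypothetical helper: derive a block id from the block's content alone."""
    return hashlib.sha1(data).hexdigest()


def collision_probability(num_blocks: int, bits: int = HASH_BITS) -> float:
    """Birthday approximation: p ~= 1 - exp(-n(n-1) / 2^(bits+1))."""
    exponent = -num_blocks * (num_blocks - 1) / 2.0 ** (bits + 1)
    # expm1 keeps precision when the probability is extremely small
    return -math.expm1(exponent)


if __name__ == "__main__":
    print(block_id(b"a"), block_id(b"b"))
    print(collision_probability(10 ** 12))
```

under these assumptions the estimate for 10^12 blocks is on the order of 1e-25. small blocks are no more collision-prone than large ones, since the digest has the same length whatever the input size.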
> * How are docs hashed? I didn't see a discussion of that.
>
> * What is the project storage impact?
> (Maybe only "publish" material goes into the system,
> or maybe storage is cheap and growing cheaper so
> we don't really care, but it needs to be mentioned.)
>
> * What language is it written in?
> (Or do I care? If it really is like a "file system",
> maybe I really don't?)
mostly Java. otherwise ex-gzz (now Fenfire) has also been written in
Python (Jython, for tests, demos and clients at least) and C++ (OpenGL
graphics API) .. but all Storm code I've seen is Java.
> * If there really is a "file system" gui, that's still
> going to be different from a shell, because I won't
> be able to launch any existing editors, will I? They'll
> need to write new files, not rewrite old ones -- and
> they'll need to understand blocks and transclusions.
yes. the file system implementation has been so far used to save data from
ex-gzz only, via the Storm API.
> * Short description of "Structured overlay networks".
> What they do, what they accomplish. (paragraph or two)
they are a type of peer-to-peer network; overlay refers to how e.g.
gnutella and freenet are layered over the Internet.
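as for the "structured" part: the idea is that every key is assigned to one specific node, so a lookup knows where to go instead of flooding the network. a toy sketch, assuming a consistent-hashing ring in the style of Chord-like DHTs. `ToyRing` and `ring_position` are hypothetical names for illustration; this is not Storm code and not any particular DHT's API.

```python
import bisect
import hashlib


def ring_position(name: str) -> int:
    """Map a node name or block key onto the ring (here: SHA-1 as an integer)."""
    return int(hashlib.sha1(name.encode("utf-8")).hexdigest(), 16)


class ToyRing:
    """Hypothetical toy DHT: every key is owned by the next node clockwise."""

    def __init__(self, node_names):
        # sort nodes by their position on the ring
        self._ring = sorted((ring_position(n), n) for n in node_names)
        self._positions = [pos for pos, _ in self._ring]

    def owner(self, key: str) -> str:
        # first node at or after the key's position, wrapping around at the end
        idx = bisect.bisect_left(self._positions, ring_position(key))
        return self._ring[idx % len(self._ring)][1]


if __name__ == "__main__":
    ring = ToyRing(["node-a", "node-b", "node-c"])
    print(ring.owner("some-block-id"))
```

the toy keeps the whole node list in one place. in a real DHT no single node does; each one keeps a small routing table and forwards the lookup towards the owner in a few hops.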
> * Short description of gzz and its relationship to Storm
this all must be updated to the current Fenfire status
> Sequenced Comments
> ------------------
> Thoughts and questions that occurred to me as I read.
ok i must go now - but thanks again and we'll certainly keep in touch (has
been also nice to see the development on ba-unrev and Benja's
participation there, too)
~Toni
>
> Abstract
> * Very cool. location-independent globally unique identifiers,
> append-and-delete only storage, and peer-to-peer networking.
> very, very cool.
>
> Intro
> * Wow. 8 references to systems that implement structured overlay
> networks. I had no idea there were so many.
>
> * 51 references in all, mostly in journals, to some *great*
> work solving problems of data sharing, granular addressing,
> linking, and versioning.
>
> * The two major issues addressed are mentioned here: dangling
> (unresolved) links and keeping track of alternative versions.
> These deserve to be mentioned in the abstract.
>
> Related Work
> * It's not totally clear what the relationship of the related
> work is to the current project. Do the systems described
> represent old work you've moved beyond, old work that
> provided useful lessons (what lessons?), a foundation for
> the current work (what parts?), predecessors or clients of
> the current work?
>
> * Mention gzz here, and its relationship to Storm (i.e. gzz
> refactored to create Storm as an independent module.)
>
> Peer-to-Peer Systems
> * Mentions a proposal for a common API usable by DHT systems,
> but it's not clear if you plan to build on that, or if it
> is a rival, or a predecessor.
>
> * Mentions "Xanalogical storage", but assumes we know what it
> is. (Needs a short description. Ok to do a forward reference
> to where it is discussed later in the article.)
>
> * Hmmm. Probabilistic access seems reasonable for "contact"
> scenarios (bunch of people together at a meeting), but not
> for "publishing" scenarios (publish document on the web).
> May be worth drawing the distinction here.
>
> Overview of Xanalogical Storage
> * This threw me. A minute ago we were talking about blocks,
> now we're talking about characters. Needs a transition to
> make the relationship apparent. (Later, you talk about
> spans. Those may be precursors to blocks or they really are
> blocks. I'm not sure which. Need to anticipate that thought
> somehow, and tell how we're building up to it, if that's
> what's going on.)
>
> * Yeah. There's the paragraphs on spans. That threw me, too.
> Suddenly I had gone from blocks to characters and now to
> spans, and I was pretty confused about how they related.
>
> * "Our current implementation" has me wondering what we're
> talking about. At this point, I thought this was more "Related
> work", like "peer to peer systems". But now it seems it's
> all one system? Or was this a previous system, before you
> started working on Storm? (Need to make the relationships
> apparent.)
>
> Storm Block Storage
> * Now we're back to blocks. Why did that last section exist,
> anyway? (make the relationship apparent)
>
> * "caching becomes trivial, because it is never necessary to
> check for new versions of blocks". Hmm. This sounds like
> versioning isn't supported, which seems like a weakness.
>
> * Interesting. There is a need for "anonymous caching". That
> allows replication, while resolving the privacy concern.
>
> * A block is hashed. Ok. And a doc contains pointers to blocks.
> Ok. But is a doc a block? How is it hashed? How do links
> contribute to the hash?
>
> * Gzz is first mentioned here. It needs to be described earlier
> in the Xanalogical addressing section.
>
> * "Storm was first developed for the Gzz application, a platform
> explicitly developed to overcome the limitations of traditional
> file-based applications" -- a *very* intriguing statement.
> When Gzz is introduced, this statement needs to be expanded to
> provide a short list of those limitations, and what Gzz did to
> solve them. (It has to be very short, of course -- no mean feat.)
>
> Implementation
> * "we have not yet put a p2p-based implementation into use"
> This paragraph is very nicely stated. You've done so much
> already, no one can blame you if this part is missing! But it
> was very good of you to point it out. You do that same kind
> of thing elsewhere, as well. Very nice.
>
> * "UI conventions for listing, moving, and deleting blocks"
> I don't know. That sounds wrong to me. Blocks should be
> under the covers, and I should be dealing with docs. Ok,
> so I have an outline-browser (for example, ADM's) or a
> similar editor. Internally, blocks are moved around when I
> edit. But my access is always thru a "Doc" -- otherwise I'll
> be looking at blocks that are divorced from any context whatever.
>
> Application-Specific Reverse-Indexing
> * This lost me pretty quickly. I wasn't sure what the purpose
> of this section was. I needed a use case or two to keep me
> oriented. Later, it becomes clear that this is
> a part of the versioning solution. Mention that fact here.
> If possible, also give one or more examples of the other
> indexing systems you created, to show what this section is
> for.
>
> * "locally, is guaranteed that all blocks are indexed by all
> applications known by the pool".
> --This paragraph should come before the previous one, which
> discusses the networked implementation, where not all
> applications may have stuff indexed (at which point I said,
> huh?)
> --More importantly, I really needed an example of an application
> or two so I could follow this. What does it mean if a
> networked app doesn't have an index? I just wasn't getting
> it. (It sure sounded like that wouldn't be good, but I
> don't know for sure.)
>
> * keyword searching
> --it seemed to me that a keyword index would return every
> *version* of a block that contained the word, which would
> be a real weakness.
> --(maybe versioning needs to be described first, so you can
> discuss the indexing process in context, and mention the
> resolutions for such issues?)
>
> Versioning
> * Aha! I read the paper over several days, and so much water
> went under the dam that I had forgotten this was mentioned
> at the beginning of the paper.
>
> * "if, on the other hand, two people collaborate..."
> VERY nice. Multiple "current version"s are allowed to exist.
> That's the only possible way to handle the situation.
>
> * Note 6:
> It wasn't clear to me how it knows which pointer blocks are
> obsolete.
>
> * Beautiful statement of points for further research
> (authenticating pointer blocks, UI for choosing alternative
> versions, suitability for web-like publishing). But the
> system looks strong enough to make me *want* to do such
> experimentation.
>
> Diffs
> * It wasn't clear if the most recent version was "intact" and
> previous versions were stored as diffs. I would hope so,
> in general. At least, if there was only one option, that's
> the one I'd want. Or can you do it either way?
>
> * "We always check that we can reconstruct the original version"
> Very nice.
>
> Discussion
> * Yes. This is the point of the article. Dangling links and
> version handling. Definitely belongs in the abstract.
>
> * Impact of immutable blocks on media use needs a mention
> here. (Maybe just hand-waving, but some mention of the
> fact that it's going to cost disk space, in return for
> improved ability to do xyz, is needed.)
>
> Conclusions
> * Wild. A Zope based on Storm. Or an OHP.
> --what's an OHP, anyway? (needs a one-line definition)
> --come to think of it, I recognize Zope, but not everyone
> will. That needs a one-line explanation, as well.
>
> * "structured overlay networks such as DHTs"
> --I need another paper describing these things, so I can
> find out what the heck they are and how they work!
>
> References
> * Excellent. Thanks.
>
> Bottom Line
> -----------
> An excellent read, and a most promising technology.
> Thanks for sending it to me.
>