
Re: [Monotone-devel] [sqlite] disk locality (and delta storage)


From: Daniel Carosone
Subject: Re: [Monotone-devel] [sqlite] disk locality (and delta storage)
Date: Sat, 11 Feb 2006 20:25:51 +1100
User-agent: Mutt/1.4.2.1i

On Fri, Feb 10, 2006 at 07:06:27PM -0500, address@hidden wrote:
> Daniel Carosone <address@hidden> wrote:
> > 
> > It seems a little odd to me to build a centralised, online
> > information system for tracking state and documenting activity around
> > and about source code in a distributed and disconnected VCS.  
> 
> Ah yes, you're right.  But in the system I envision, the wiki
> and bug-tracking are also decentralized, disconnected, and
> distributed.

Now I'm *very* interested :)

> > Wiki pages don't seem so hard, they're pretty much text documents
> > stored in a VCS anyway. 
> 
> There are some complications.  Each wiki page acts as if it were
> its own independent project or branch.  

Hm. I'm not sure I see that as the case; if I update several pages
together and change the linkage between them, I could well want a
single commit revision across the several topic pages.  But I've not
given it much thought..  it probably does work close to that way (a
commit per page update) when the editing is being done via html forms.

The wiki I'm most familiar with, in implementation, is Twiki (and even
then, only vaguely).  It's a web front-end onto RCS files that hold
the wiki text.  If you have access to the server, there's nothing
stopping you doing your own checkout/edit/checkin, for example if you
find the browser editing panel restrictive for what you need to do.

So I just figured wiki content in monotone might as well work pretty
much the same way, with the webserver as one committer.

> And then you probably want
> some way to see all current leaves and to do merges from the web
> interface.  

Yes. A nice web-based merge UI is one thing you'd want, but it needn't
be there in the first instance.  You're going to need some rules and
mechanisms for when the webserver will "update" as new content arrives
in the database via netsync; in the first instance you could rely on
developers to merge and provide the webserver with a unique head.
Until that's resolved, the webserver and its users could just keep
committing along the line they started with.
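
As a rough sketch of that first-instance rule (Python, with the
revision graph modelled as a plain dict of parent lists; the function
names and the dict layout are illustrative assumptions, not monotone
APIs): the webserver moves to the new head only when there is exactly
one, and otherwise stays on the line it started with.

```python
def heads(parents):
    """Revisions that have no children in the graph."""
    nodes = set(parents)
    for ps in parents.values():
        nodes.update(ps)
    with_children = {p for ps in parents.values() for p in ps}
    return nodes - with_children


def next_base(parents, current):
    """Revision the webserver should commit on top of next."""
    hs = heads(parents)
    if len(hs) == 1:
        # A unique head necessarily descends from (or is) `current`.
        return next(iter(hs))
    return current  # unmerged heads: keep committing on our own line


# "b" is the webserver's last commit; "c" arrived via netsync.
graph = {"a": [], "b": ["a"], "c": ["a"]}
before_merge = next_base(graph, "b")

# A developer merges the two heads into "d".
graph["d"] = ["b", "c"]
after_merge = next_base(graph, "b")
```

Here before_merge stays at "b" and after_merge becomes "d".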

If you start running multiple web servers as semi-independent mirrors
(not counting 'personal' ones, as below), then you get to the point of
needing to provide web users with more UI to deal with choosing and
merging heads.  Much of that UI is probably very much like the UI you
need to give them to navigate source content, viewmtn-style.

> If you intend your system to be publicly accessible, then
> you will also need some mechanism to delete (or at least 
> permanently hide from public view) the spam that miscreants
> occasionally post on wiki pages and sometimes also on tickets.
> Some kind of "death cert" perhaps.

Yes, but I think this applies for monotone projects that accept
anonymous revisions, regardless of whether the content is a wiki page
or anything else.

> > Bug tracking state would require a little
> > more smarts, with suitable text formats stored in the VCS and
> > code/hooks to assist in making sane merges between multiple heads of a
> > given ticket, but even that doesn't seem terribly hard.
> 
> My thinking about tickets is that each change to a ticket
> is just a small, signed, immutable record (like a cert)
> that contains a variable number of name/value pairs.  The
> names are things like "title", "description", "status",
> "assigned-to", etc.  To reconstruct the current state of
> a ticket, you sort all the pieces by their creation time
> (their actual creation time, not the time they happened to
> arrive at your database) and read through them one by one.
> Values overwrite prior values with the same name.  So
> in the end, you have one value for each field name - and
> that is the current state of your ticket.
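
For concreteness, the replay described in the quote above might look
roughly like this.  It is only a sketch: the TicketChange record and
its field layout are assumptions made for illustration, not an
existing cert format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TicketChange:
    created: float  # claimed creation time, seconds since the epoch
    fields: tuple   # (name, value) pairs set by this change


def replay_by_time(changes):
    """Current ticket state: later-created values overwrite earlier ones."""
    state = {}
    for change in sorted(changes, key=lambda c: c.created):
        state.update(change.fields)
    return state


# Arrival order differs from creation order; only `created` matters.
changes = [
    TicketChange(200.0, (("status", "closed"),)),
    TicketChange(100.0, (("title", "crash on sync"), ("status", "open"))),
]
current = replay_by_time(changes)
```

Note that a record carrying an earlier claimed time slots in ahead of
existing records no matter when it arrives.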

I wouldn't use creation time; datestamps are unreliable.  Even without
someone issuing malicious timestamps, there are sometimes good uses
for dating a revision in the past, or even possibly in the future.
(Ok, I can't think of a good case for future dates right now, but I
have quite valid ones for past-dating certs, as well as the cvs_import
case).

Instead, I'd envisaged using the revision DAG to let one state replace
a previous one (ie, topological sort in your algorithm above).  When
merging paths that have conflicting values, it's probably best for the
user to choose the correct one, and place that value on the merge
node.
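
A minimal sketch of that DAG-ordered replay, assuming the graph is
given as a dict of parent lists and each revision carries a dict of
the fields it sets (both layouts, and the name ticket_state, are
hypothetical).  A value is dropped at a merge when a descendant
revision has already replaced it; if two unrelated values remain, the
merge node has to carry the user's choice.

```python
from graphlib import TopologicalSorter  # Python 3.9+


def ticket_state(parents, changes, head):
    """Ticket fields as seen from `head`.

    parents: revision id -> list of parent revision ids
    changes: revision id -> {field: value} set in that revision
    """
    ancestors = {}  # revision -> set of all its ancestors
    states = {}     # revision -> {field: (value, revision that set it)}
    # static_order() yields every revision after all of its parents.
    for rev in TopologicalSorter(parents).static_order():
        ps = parents.get(rev, ())
        ancestors[rev] = set(ps).union(*(ancestors[p] for p in ps))
        own = changes.get(rev, {})

        # Collect what each parent believes, keyed by the setting revision.
        candidates = {}  # field -> {origin revision: value}
        for p in ps:
            for name, (value, origin) in states[p].items():
                candidates.setdefault(name, {})[origin] = value

        state = {}
        for name, by_origin in candidates.items():
            # Superseded: some other candidate descends from this origin.
            live = {o: v for o, v in by_origin.items()
                    if not any(o in ancestors[other] for other in by_origin)}
            if len(set(live.values())) > 1 and name not in own:
                raise ValueError(f"{name!r} conflicts at {rev}: user must choose")
            origin, value = next(iter(live.items()))
            state[name] = (value, origin)
        for name, value in own.items():
            state[name] = (value, rev)  # values placed on this node win
        states[rev] = state
    return {name: value for name, (value, _) in states[head].items()}


parents = {"r1": [], "r2": ["r1"], "r3": ["r1"], "r4": ["r2", "r3"]}
changes = {
    "r1": {"title": "crash on sync", "status": "open"},
    "r2": {"status": "closed"},
    "r3": {"assigned-to": "dan"},
}
merged = ticket_state(parents, changes, "r4")
```

Asking for the state at "r3" instead of "r4" still reports the ticket
as open, since the close only happened on the other path.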

By which time, you might as well just store the ticket as structured
content, perhaps with a commit-hook to validate well-formedness and
other constraints.
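
The kind of check such a hook might run could be as small as the
following.  This is a sketch only: the "name: value" file format, the
required fields and the allowed status values are all made up for
illustration, and it is written in Python rather than as an actual
monotone hook.

```python
REQUIRED = {"title", "status"}        # assumed mandatory fields
ALLOWED_STATUS = {"open", "closed"}   # assumed status vocabulary


def parse_ticket(text):
    """Parse 'name: value' lines into a dict, rejecting malformed input."""
    fields = {}
    for lineno, line in enumerate(text.splitlines(), 1):
        if not line.strip():
            continue  # blank lines are allowed
        name, sep, value = line.partition(":")
        name = name.strip()
        if not sep or not name:
            raise ValueError(f"line {lineno}: expected 'name: value'")
        if name in fields:
            raise ValueError(f"line {lineno}: duplicate field {name!r}")
        fields[name] = value.strip()
    return fields


def validate_ticket(text):
    """Return the parsed fields, or raise ValueError if a constraint fails."""
    fields = parse_ticket(text)
    missing = REQUIRED - fields.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    if fields["status"] not in ALLOWED_STATUS:
        raise ValueError(f"unknown status {fields['status']!r}")
    return fields


ticket = validate_ticket("title: crash on sync\nstatus: open\n")
```

A commit would be refused whenever validate_ticket raises.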

> This approach gives you automatic merging and a complete
> change history/audit trail. 

It does give you simpler merging, but perhaps too simple? "Overwriting"
values like this mightn't be the best thing in some cases.. and it
means people can go back and change history.

Using the DAG means that a bug can be closed on one branch, (or even
one divergent head/path within a branch) but still open on another.
Which seems exactly correct, to me.

In any case, just like for actual code, there will be some kinds of
information and state that should be *in* the DAG, as content, and
immutable once established, and other kinds that should be *on* the
DAG, as decoration and further markup that you can go back up the tree
and add later.  It's "just" a matter of deciding which is which for
each item.

> Tickets also benefit from having "remarks" that people can
> append to the ticket (without overwriting) and attachments.
> Both are handled by separate certs. 

We have comment certs now. Attachments might as well just be content
(files), though the cert could contain links in some form to that.

> It is also very useful to have certs that record a link 
> between a revision and a ticket.  In this way you can record 
> that a bug was introduced by, or fixed by, a particular 
> revision.  The linkage between tickets and revisions has 
> proven to be particularly valuable in CVSTrac.

Absolutely.  A number of projects have invented various ways of doing
the same thing.  In NetBSD, we can mention PR numbers in commit
messages, and the commit mail winds up in the gnats DB. RT tickets can
similarly recognise various forms and create web links.

I'd also like to see a system that encourages linking a bug with a
testresult, so testresult certs can be used to see the bug stay
closed.

> > It could still be a web ui if people find that comfortable, just one
> [...]
> Just type (for example):
> 
>    monotone httpserver &
> 
> and then point your webbrowser at 127.0.0.1.

My personal favourite is jetty in Java for this kind of thing; I'm not
sure monotone itself should grow an HTTP server :) But yes, I mean an
embedded webserver in a process you start up locally (rather than
something that has to run as a module or cgi in some other web server).

> I perceive that your thinking evolved as you composed your email.
> At the beginning you sounded skeptical.  But here toward the end
> you seem to be saying "hey, there might be some good ideas here".

Oh, no, I've been talking about these ideas for a while.  Any
skepticism at the start was only about the projects I've seen so far,
which seem to be missing the point.  I'm delighted to hear you're
heading just about exactly where I was hoping someone would!

--
Dan.


