monotone-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Monotone-devel] Merging suspend branch


From: William Uther
Subject: Re: [Monotone-devel] Merging suspend branch
Date: Sat, 14 Jul 2007 21:03:42 -0700


Zack Weinberg wrote:

On 7/14/07, William Uther <address@hidden> wrote:
I've been playing with the branch, and it does slow down listing
branches quite a bit.  Listing branches isn't too common an
operation, but I switched things up to try to help this:

   - The main 'suspend' cert change is now in
project_t::get_branch_heads().
   - I've added a new lua function: get_branch_heads() which gets the
branch heads :)
   - I've added a default ignore_branch() lua hook which checks if
there are any branch heads and
     ignores the branch if there are not.

The result is basically the same as before, except that you can
choose to switch off the checking for heads behaviour when listing
branches (by changing the lua hook), which makes makes quite a speed
difference (if you do this operation a lot).

I gotta say I'm not real enthusiastic about this hook.  If the only
reason anyone would ever change it is to speed up an operation at the
expense of clarity in the results, I'd argue that we should fix the
performance problem instead...  Can you think of any other reason
someone would change it?

The ignore_branch() hook already existed.  It just didn't have a default
implementation. I moved some of my old patch into a new default implementation.

I'm not particularly enthusiastic about the get_branch_heads() lua function though. If there are any errors in there, then it'll pop out as a lua failure but all the debug output gets lost.

I then started looking at why finding branch heads is so slow.  The
answer is simple: project_t::get_branch_heads() first gets every
revision in the branch, then erases ancestors.  Getting the heads of
every branch means considering every revision in the DB.  It might be
nice some time in the future to cache the heads of each branch in the db.

The information is already kinda-sorta in the database ... there's a
"revision_ancestry" table that just contains (parent, child) pairs.
You can pull out all revisions that are not parents of any other
revision with one SQL query:

  SELECT child FROM revision_ancestry WHERE child NOT IN (SELECT
parent FROM revision_ancestry)

-- on my local database, that executes in 0.2 seconds (including time
to format the results neatly and write them to /dev/null).

OOOoooo

That's a much better solution :)

I'll change things back the way they were with that in there...

hrm. We'll need to be a little trickier. If a head of a branch has a child in another branch then your query not return it, but it could still be a head.

I've never done a lot of SQL, but I think we want something like (and if someone were to clean this up, that would be good):

SELECT revision_certs.id FROM revision_certs, revision_ancestry // get the id of revs WHERE revision_certs.id = revision_ancestry.child // that are children in the ancestry table AND revision_certs.name = ? AND revision_certs.value = ? // and are in the branch we're looking for AND revision_ancestry.child NOT IN // and are not (SELECT revision_ancestry.parent FROM revision_ancestry, revision_certs WHERE // parents in the ancestry table revision_ancestry.child = revision_certs.id AND // where the corresponding child's branch revision_certs.name = ? AND revision_certs.value = ? // is the branch we're looking for
     )

Hrm. If you have a single revision in your DB, does it get entered in the ancestry table? Or do we need to extend this further.

And even that wont work, because you're not checking the validity of the certs. We may not be trusting certs by specific individuals, but we will still have removed revisions from the list of heads if an untrusted person has committed a child.

After the revs in a branch have been gathered, this is the function that does the work finding the heads: erase_ancestors_and_failures()

And it has this comment:

// this call is equivalent to running:
//   remove_if(candidates.begin(), candidates.end(), p);
//   erase_ancestors(candidates, app);
// however, by interleaving the two operations, it can in common cases make // many fewer calls to the predicate, which can be a significant speed win.
//
// FIXME: once we have heights, we should use them here. The strategy will
// be, to calculate the minimum height of all the candidates, and teach
// accumulate_ancestors to stop at a certain height, and teach the code that // removes items from the candidate set to occasionally rescan the candidate // set to get a new minimum height (perhaps, whenever we remove the minimum
// height candidate).

I don't understand heights. Should I leave things the way they were and wait for that FIXME to be fixed?

Cheers,

Will       :-}






reply via email to

[Prev in Thread] Current Thread [Next in Thread]