[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Monotone-devel] roster handling speedups

From: Timothy Brownawell
Subject: [Monotone-devel] roster handling speedups
Date: Fri, 12 Feb 2010 22:09:20 -0600
User-agent: Mozilla-Thunderbird (X11/20091109)

I've been doing various things to make pulls faster, here's a brief overview of the recent changes, and some other thoughts further down:

 * the roster cache has a minimum size of 3
      At least 2 cache slots are required for the cache to be at
      all useful, and expanding it to 3 reduces the number of
      misses by another 75% or more. Further expansions have much
      smaller effects.
 * node_map is now a COW type, and shared nodes are copied on write
      A large portion of our time was spent copying rosters. Also,
      roster_merge can take advantage of this to do a little less work.
 * marking_map is now a COW type, and shared markings are also cow
      Another large portion of time was spent copying marking_maps.
 * make_roster_delta_t can take a changeset as a hint for non-merges
      If you've only touched 5 nodes out of 10,000, making a delta
      is much faster if you can just look at those 5 nodes instead
      of having to scan all 10,000 for differences. This doesn't work
      for merge revisions, because for example the node in
          a*  a*
           \ /
      will not be part of the cset because it still has the same value,
      but the marking in the child will be different that the marking
      in either parent.
 * put_roster_for_revision() doesn't do redundant sanity checking
   This would call make_roster_for_revision(), which returns a
   sanity-checked roster. Then it would call calculate_ident() on that
   roster, which would sanity-check it again. Now calculate_ident()
   takes a flag to not do the sanity checking.
 * misc speedups
   - roster_t::check_sane() was doing 2 loops, now it only does one
   - is_{dir,file}_t now have a flag to check, instead of calling
     dynamic_cast and checking the result
   - dfs_iter:advance_top() had multiple top() calls, that g++ couldn't
     optimize into a single call due to being non-const. Amazingly
     enough, this is called enough to actually show up on profiles.
   - roster_t::print_to() is hand-coded, instead of calling basic_io
 * roster cache instrumentation
   The roster cache can now log hit/miss information to a file, if given
   the --roster-cache-performance-log=<file> option. This was to help
   figure out what size to use for the first item here.

There's also a branch net.venge.monotone.tbrownaw.pull-reduced-sanity with a new option --sanity-check-percent=<0..100> taken by pull and clone. This makes it run check_sane() and verify the manifest_id on only that percent of incoming rosters, instead of all of them. I'm guessing we don't want this in mainline.

There are some further possibilities that I'm not pursuing right now:

 * There's a lot of time spent in encode_hexenc, while printing
   manifests. Perhaps these could be stored instead of being always
   recomputed. Maybe file_node could remember its hexified id, or maybe
   doing something with id::symtab and the vocab macros and then
   teaching the hexenc/hexdec functions about it would work.

 * roster_t::check_sane() checks the following:
   - root_dir is set (fast)
   - there are no loop in the node tree and
   - all nodes are attached (walking the tree takes nodes.size() steps)
   - node names match what their parents call then
   - file nodes have a content hash
   - attr (live?, value) pairs make sense
   - nodes know their ID

   Which leads to the following ideas:

   + Perhaps attrs could be hidden behind get/set/clear functions that
     enforce (live?, value) sanity, so it doesn't have to be checked?
   + Perhaps the node's parent ID and dir_node's child map could be
     hidden similarly, so that the parent/child name matching doesn't
     have to be checked?
   + print_to() needs to walk the tree, so we could make it it verify
     that all nodes are attached and there aren't any loops. Is
     check_sane() called in places we don't also print (which includes
     anything that checks the manifest_id)?
   + print_to() prints the file hash, so it can just check that it
     isn't empty at the same time
   + limit the number of places that nodes are inserted into the
     ndoe_map or node::self is touched (and make it private) to one or
     two functions, so that this doesn't need to be checked

   Basically, rewrite things into encapsulated "obviously no bugs" form,
   and then fold the remaining necessary runtime checks into code that
   has to look at the stuff being checked anyhow.

 * The roster cache doesn't respect the COW. If there were a good,
   *fast* (don't just scan every node of every roster) way to figure out
   the actual space usage, it could know to hold many more rosters than
   it currently does. Maybe just a global count of the number of
   existing nodes would be enough? How many non-cached rosters do we
   actually generally hold while putting things into the cache (and
   remember that copies of things in or going into the cache don't
   count, because of the COW)? This would require changing how the
   size estimator works for rosters, while not changing how the size
   estimator for the delta cache works.

 * rosters and nodes (and marking_maps and markings) have a cow_version
   field, that increments on new/copied rosters (and marking_maps), and
   help identify when a node (or marking) is not shared. Perhaps it
   could be handled in such a way that when doing a merge, a
   multimap<cow_version, node_id> or similar could be scanned forward
   from the lower of the parent's cow_versions and identify all nodes
   *and markings* that need to be touched by the merge (rather than
   being copied from either parent arbitrarily). Perhaps this could also
   be used in the roster_delta code. In the common case that the rosters
   being merged are some of the most recent ones, this would work very
   well. In the case that one is rather old, it would probably still
   work well since many files are more-or-less never touched.

reply via email to

[Prev in Thread] Current Thread [Next in Thread]