Re: [Monotone-devel] [patch] roster-deltas (40% pull speedup)


From: Nathaniel Smith
Subject: Re: [Monotone-devel] [patch] roster-deltas (40% pull speedup)
Date: Mon, 21 Aug 2006 18:15:41 -0700
User-agent: Mutt/1.5.12-2006-07-14

On Mon, Aug 21, 2006 at 03:40:21PM -0700, Eric Anderson wrote:
> Nathaniel,
>       If you're going to do this big change, it may also be worth
> switching over to binary roster deltas as well, as in
> 464e510af4959231ff63352c902c689b0f1687aa; which measured a somewhat
> lower improvement for the pull (1.2x) than your 40% speedup, but also
> had the side effect of making annotate fast in most cases.  Since
> there is already all of the netio.hh stuff, binary formats are
> portable and about as easy as the text ones.

Interesting point.  I didn't do this already, just because I was
trying to make as small a change as possible (no, really...).  Now
that the regenerate_rosters machinery is there, it should be easy
enough to make the change later, so it might not be worth holding up
this landing on mainline to do it.

I would be curious how much they win at this point; the roster-delta
strategy for reconstruction changes that landscape, at least somewhat.
For instance, my understanding is that for annotate, the old code
does:
  0) load deltas from disk
  1) apply deltas to reconstruct roster text
  2) use sha1 to verify that roster text is uncorrupted
  3) parse that roster, to find out the content of the given file in
     this particular revision
And your changes let us (in this case) skip step 2, and make step 3
faster by only parsing out the relevant information.

With the roster-deltas branch, this process becomes
  0) load deltas from disk
  1) use sha1 to verify that the deltas are uncorrupted
  2) apply deltas to reconstruct roster directly
so the hashing cost is greatly reduced, and step 3 just disappears
entirely.
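
To make the difference concrete, here is a toy Python model of the two
load paths. It is only a sketch: the "path content-id" roster text, the
"set"/"del" delta lines, and every function name are made up for
illustration and are not monotone's real formats or code. It counts how
many bytes each path feeds to sha1.

```python
import hashlib

def sha1_hex(data: bytes) -> str:
    return hashlib.sha1(data).hexdigest()

def lines_to_text(lines) -> str:
    # Hypothetical canonical roster text: sorted "path content-id" lines.
    return "".join(line + "\n" for line in sorted(lines))

def old_style_load(base_text, text_deltas, expected_roster_hash):
    """Old path: rebuild roster text, hash all of it, then parse it."""
    lines = set(base_text.splitlines())
    for removed, added in text_deltas:           # 1) apply text deltas
        lines = (lines - set(removed)) | set(added)
    text = lines_to_text(lines)
    data = text.encode()
    if sha1_hex(data) != expected_roster_hash:   # 2) hash the full text
        raise ValueError("roster text is corrupt")
    roster = dict(line.split(" ", 1) for line in text.splitlines())  # 3) parse
    return roster, len(data)

def roster_delta_load(base_roster, stored_deltas):
    """New path: hash each small delta, then apply it to the roster."""
    roster = dict(base_roster)
    hashed = 0
    for blob, expected_hash in stored_deltas:
        if sha1_hex(blob) != expected_hash:      # 1) hash only the delta
            raise ValueError("roster delta is corrupt")
        hashed += len(blob)
        for line in blob.decode().splitlines():  # 2) apply it directly
            op, path, *rest = line.split(" ")
            if op == "set":
                roster[path] = rest[0]
            elif op == "del":
                roster.pop(path, None)
    return roster, hashed

# A 1000-file roster in which one file's content id changes.
base = {f"src/file{i}.cc": f"id{i}" for i in range(1000)}
expected = dict(base)
expected["src/file7.cc"] = "id7b"
expected_hash = sha1_hex(
    lines_to_text(f"{p} {c}" for p, c in expected.items()).encode())

base_text = lines_to_text(f"{p} {c}" for p, c in base.items())
text_deltas = [(["src/file7.cc id7"], ["src/file7.cc id7b"])]
old_roster, old_hashed = old_style_load(base_text, text_deltas, expected_hash)

blob = b"set src/file7.cc id7b\n"
new_roster, new_hashed = roster_delta_load(base, [(blob, sha1_hex(blob))])
```

Both paths end up with the same roster, but the old one hashes the
whole reconstructed text while the new one hashes only the delta. The
delta still has to be parsed, so that cost does not go away.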

So it'd be interesting to profile annotate again with these changes; I
did a quick timing run, and annotate with roster-deltas did seem to be
2-4x faster than mainline.  If I had to guess, the bottlenecks now are
likely things like "copying rosters to insert them in the cache".  But
maybe parsing the deltas is a bottleneck too, dunno.

(There are interestingly different ways to optimize this stuff made
possible by roster-deltas -- since roster deltas record changes
directly by node_id, we could have an interface that traversed
roster-deltas directly, without reconstructing full rosters at all,
for the use of annotate and restricted log...)
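
A minimal sketch of what such an interface might look like, in Python
for brevity. Everything here is hypothetical: the RosterDelta class,
its fields, and revisions_touching are invented for illustration, and
the walk assumes a linear history with one parent per revision, which
real monotone history (a DAG) does not guarantee.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

@dataclass
class RosterDelta:
    # Hypothetical in-memory form of one roster delta.
    revision: str
    parent: Optional[str]
    # node_id -> new content id, for nodes whose content changed.
    content_changed: Dict[int, str] = field(default_factory=dict)

def revisions_touching(node_id: int,
                       deltas: Dict[str, RosterDelta],
                       head: str) -> List[str]:
    """Walk deltas from head back to the root, collecting the revisions
    that changed node_id, without ever building a full roster."""
    touched = []
    rev: Optional[str] = head
    while rev is not None:
        delta = deltas[rev]
        if node_id in delta.content_changed:
            touched.append(rev)
        rev = delta.parent
    return touched

deltas = {
    "r1": RosterDelta("r1", None, {1: "a", 2: "b"}),
    "r2": RosterDelta("r2", "r1", {2: "c"}),
    "r3": RosterDelta("r3", "r2", {1: "d"}),
}
history_of_node_1 = revisions_touching(1, deltas, "r3")
history_of_node_2 = revisions_touching(2, deltas, "r3")
```

Because the lookup is keyed on node_id, a rename along the way would
not break the walk, which is what would make this attractive for
annotate and restricted log.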

-- Nathaniel

-- 
"...All of this suggests that if we wished to find a modern-day model
for British and American speech of the late eighteenth century, we could
probably do no better than Yosemite Sam."



