Re: [Monotone-devel] Re: partial pull #3 - calling conventions
From: Matt Johnston
Subject: Re: [Monotone-devel] Re: partial pull #3 - calling conventions
Date: Sun, 27 May 2007 19:22:48 +0800
User-agent: Mutt/1.5.13 (2006-08-11)
On Sat, May 26, 2007 at 10:22:54PM +0200, Christian Ohler wrote:
> Lapo Luchini, 2007-05-26:
>
> >Space is not the scarce resource here (well, not the most important one,
> >at least, IMHO): time is.
> >Pull time is not only a question of size, it's also (mainly?) a question
> >of the time taken by the multiple hash and signature verifications.
>
> Ok. Still, verifying signatures on 10MB worth of data is very likely
> faster than verifying them on 71MB worth of data.
>
> Assuming that cryptographic verification is what's really taking too
> long at pull time, maybe we shouldn't be doing it at pull time?
> Wouldn't it be possible to defer the verification of each file's hash,
> each revision's id, each cert's signature etc. until the respective item
> is accessed for the first time?
It's not the SHA1 or RSA verification that is slow. From a
profile [1] of a local pull of n.v.m* taking ~22 mins wall
time (~14 mins user time):
- Around 1 min 30 s is spent in sha160 hashing. Half of that
  is just checking that the reconstructed file versions that
  get pulled out of the database match what was expected.
- zlib inflate/deflate doesn't show up that much (40 secs
  total perhaps?)
- A couple of minutes seem to be spent on verifying and
  writing out file data/deltas (the delayed write cache
  seems pretty efficient).
- 12 minutes are spent in put_revision(), mostly in roster
  construction (put_roster_for_revision()).
  - get_uncommon_ancestors() takes ~3 minutes. It's mostly
    to do with traversing up long-lived diversions such as
    n.v.m.cvssync* and *.select-heads-of (I think).
  - ~2 minutes are spent writing out the full manifest
    data of every revision, so that we can check that its
    hash matches that specified in the revision.
  - Lots of time seems to be spent destroying std::maps in
    dir_node and other little memory operations like that.
    (this may be OS-memory-allocation-dependent - profile
    it yourself)
- Do RSA signatures even get checked in a pull? I can't see
  them.
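
To put the figures above in proportion, here is a small sketch that
expresses each one as a share of the ~14 minutes of user time. The
numbers are the approximate ones quoted in the profile; the dictionary
and function names are made up for illustration. The entries can
overlap (hashing, for instance, happens inside other calls), so they
are not meant to be summed.

```python
# Approximate figures quoted in the profile, in minutes.
WALL_MIN = 22.0
USER_MIN = 14.0

profile_min = {
    "sha160 hashing": 1.5,
    "zlib inflate/deflate": 40 / 60,
    "put_revision()": 12.0,
    "get_uncommon_ancestors()": 3.0,
    "writing full manifests": 2.0,
}


def share_of_user_time(minutes, user_min=USER_MIN):
    """Return `minutes` as a percentage of the user time."""
    return 100.0 * minutes / user_min


# Fraction of the wall-clock time accounted for by user time alone.
cpu_active = USER_MIN / WALL_MIN

if __name__ == "__main__":
    # Largest entries first; entries may overlap, so no total is printed.
    for name, minutes in sorted(profile_min.items(), key=lambda kv: -kv[1]):
        pct = share_of_user_time(minutes)
        print(f"{name:26s} {minutes:5.2f} min  {pct:5.1f}% of user time")
    print(f"user time / wall time = {cpu_active:.2f}")
```

On these numbers put_revision() accounts for roughly six sevenths of
the user time, while sha160 hashing is closer to a tenth.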
My conclusion is that removing file data will be useful for
people with slow network connections, but not for speeding
up netsync generally. Optimisation of the revision->roster
operations seems like it would be fairly beneficial. They
do count as "consistency checking", but not in the
cryptographic sense. I'm not really sure how the
revision->roster checking could be delayed at pull time,
since the current head revision (most likely to be of
interest) depends on all the previous revisions that have
been received.
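
For reference, the kind of deferral suggested in the quoted message
(check a file's hash only on first access) might look roughly like the
sketch below. This is not monotone code: the class and its method are
hypothetical, and it only covers the simple per-file hash case, which
the profile suggests is not the expensive part. It does not address
the revision->roster dependency described above.

```python
import hashlib


class LazyVerifiedBlob:
    """Hypothetical wrapper: stores data unverified, checks SHA-1 on first read."""

    def __init__(self, expected_sha1_hex, data):
        self.expected = expected_sha1_hex
        self._data = data
        self.verified = False  # nothing is hashed when the blob is stored

    def get(self):
        # Hash once, on first access; later reads skip the check.
        if not self.verified:
            actual = hashlib.sha1(self._data).hexdigest()
            if actual != self.expected:
                raise ValueError(
                    f"hash mismatch: expected {self.expected}, got {actual}"
                )
            self.verified = True
        return self._data


if __name__ == "__main__":
    payload = b"file contents received over netsync"
    blob = LazyVerifiedBlob(hashlib.sha1(payload).hexdigest(), payload)
    print(blob.verified)       # False: nothing has been checked yet
    print(len(blob.get()))     # first access triggers the hash check
    print(blob.verified)       # True
```

The trade-off is that a corrupt item is only noticed when someone
first reads it, rather than at pull time.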
I'm also curious why the pull process was only active for
2/3 of the pull time - possibly the netsync protocol could
be pipelined better.
Matt
[1] 1.83 GHz Core Duo MacBook, Mac OS X 10.4.9, profiled
using Shark, statistically sampling every 60 ms.