network costs of history copying (was Re: [Monotone-devel] Monotone server unreachable)
Tue, 17 Jan 2006 21:16:00 -0800
On Mon, Jan 16, 2006 at 11:15:38PM +0300, Petr Ovtchenkov wrote:
> The size of new pull is near the 70-80M, isn't it? That's really
> high barrier for new member of any project, based on monotone.
> Do you have recommendation for such use-case?
I just did a pull of net.venge.monotone alone, to test. (Some of
the contrib branches contain some rather large stuff, like a copy of a
bunch of big 3rd-party Java libraries, so I left those out to get a
fair test.) Netsync transferred 25.6 megabytes. By comparison, a
fresh checkout is 14 megabytes, and the same checkout compressed with
gzip at level 3 is 4.5 megabytes -- this should be very roughly
comparable to the network traffic needed to do a CVS checkout,
assuming CVS's network protocol has no overhead, which may or may
not be the case...
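
To put those three figures side by side, here is a rough sketch of the
arithmetic in Python. The constant and function names are made up for
illustration; the only real inputs are the sizes quoted above, and
treating a CVS checkout as costing somewhere between the compressed and
uncompressed tree size is the same assumption made in the text.

```python
# Sizes quoted above, in megabytes.
NETSYNC_PULL_MB = 25.6    # full pull of net.venge.monotone
FRESH_CHECKOUT_MB = 14.0  # uncompressed working copy
GZIP3_CHECKOUT_MB = 4.5   # same working copy, gzip level 3


def pull_cost_in_checkouts(pull_mb, checkout_mb):
    """Express the one-time pull as a multiple of one checkout's transfer."""
    return pull_mb / checkout_mb


# Assumption: a CVS checkout moves somewhere between the compressed and
# the uncompressed size of the tree, with no protocol overhead.
low = pull_cost_in_checkouts(NETSYNC_PULL_MB, FRESH_CHECKOUT_MB)
high = pull_cost_in_checkouts(NETSYNC_PULL_MB, GZIP3_CHECKOUT_MB)

print(f"{low:.1f}x an uncompressed checkout, {high:.1f}x a compressed one")
# prints "1.8x an uncompressed checkout, 5.7x a compressed one"
```
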
So, for monotone's own tree ATM, the upfront cost in network usage is
equivalent to doing somewhere between 2 and 5 CVS checkouts. This
doesn't seem like a very high barrier to me, though of course this
depends on one's own thresholds for highness (and those of one's
It's also worth noting that you never need to pay this cost again,
whereas with a system like CVS every time you want another working dir
to work on a branch or the like, you have to do a new checkout from
Unfortunately, these numbers only get worse, not better, as a project
lives on. The numbers above are for 3.3k revisions over almost 2.5
years, so not a trivial history, but there are certainly projects out
there with _much_ higher change rates and/or longer histories. And
eventually, you really really need a way to avoid that full download.
This is what we call "partial pull" in discussions; we've known
forever that we need it eventually.
The thing is, though, it's a bit tricky to implement in a way that
will be consistent (though rosters might make it possible to get past
the worst hurdle), and the projects that are big enough to really need
partial pull mostly aren't going to consider monotone until we also do
a bunch of other things that are more generally useful.
So that's why this hasn't happened yet. Just trying to allocate
the resources we have in the right order -- and if someone wants to
step up and work on this, go for it, we'll be happy to help :-).
When the flush of a new-born sun fell first on Eden's green and gold,
Our father Adam sat under the Tree and scratched with a stick in the mould;
And the first rude sketch that the world had seen was joy to his mighty heart,
Till the Devil whispered behind the leaves, "It's pretty, but is it Art?"
-- The Conundrum of the Workshops, Rudyard Kipling