Re: [Monotone-devel] url schemes
From: Markus Schiltknecht
Subject: Re: [Monotone-devel] url schemes
Date: Sun, 23 Mar 2008 13:09:00 +0100
User-agent: Mozilla-Thunderbird 2.0.0.9 (X11/20080109)
Hello Derek,
first of all: nice work in nuskool! Thanks for ripping out my silly
code, which re-implemented a kind of toposort. Dunno what I was thinking
there...
[ A small side note: I'd have had an easier life reading your cool
patches if you had committed whitespace changes separately. ]
Derek Scherger wrote:
> Markus Schiltknecht wrote:
>> Then, there are the planned nuskool commands. Those are currently
>> encoded entirely in JSON. The HTTP client requests the same URL every
>> time, and encodes the query in JSON. ATM nuskool doesn't support
>> branch inclusion or exclusion patterns. The commands currently are:
>> * inquiring revisions: asks the server if it has certain revisions
>> * getting descendants: querying the ancestry map of the server
>> * getting (pulling) a revision
>> * putting (pushing) a revision
>> * getting file data
>> * putting file data
>> * getting file delta
>> * putting file delta
>
> I'm not convinced they ought to stay this way though. That's just where
> I ended up after picking up what graydon had started and I don't know
> whether I went in the direction he had in mind or not. I have the
> feeling that it's a bit verbose, but I'm not sure what to do about that
> yet.
Too verbose, maybe. But also very simple to understand.
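
To illustrate the "same URL, query encoded in JSON" idea described above, here is a minimal sketch. The command names, the "type"/"params" layout, and both helper functions are hypothetical; the actual nuskool wire format is not shown in this thread.

```python
import json

# Hypothetical layout: every request is one JSON object sent to the same URL.
# Neither the field names nor the command names come from nuskool itself.
def encode_request(command, **params):
    """Serialize one command as a JSON document for the HTTP request body."""
    return json.dumps({"type": command, "params": params}, sort_keys=True)

def decode_request(body):
    """Parse a request body back into (command, params)."""
    doc = json.loads(body)
    return doc["type"], doc["params"]

# Two of the listed commands, expressed in this made-up encoding.
inquire = encode_request("inquire_revisions", revs=["abc123", "def456"])
get_rev = encode_request("get_revision", rev="abc123")
```

The upside of such a format is that any scripting language with a JSON library can produce and read it; the downside is the per-request overhead discussed further down.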
> On the bright side, I have managed to pull files and revisions from my
> monotone database using the nuskool branch (which doesn't yet pull certs
> or keys or care about branch epochs but does basically seem to work). It
> is rather slow at the moment (71 minutes vs 25 minutes with netsync,
> which *does* pull certs, keys etc.). I haven't done any profiling yet
> but I would expect two things to show up.
Uh.. that is the time to pull the complete net.venge.monotone
repository, right? While that certainly sounds awful, let me point out
that that's not the case where nuskool is supposed to be the winner.
It's rather optimized for subsequent pulls and it's already faster than
netsync there:
# time mtn -k address@hidden pull -d ../test.db nabagan.bluegap.ch "net.venge.monotone*"
mtn: connecting to nabagan.bluegap.ch
mtn: finding items to synchronize:
mtn: certificates | keys | revisions
mtn:       39,427 |   54 |    12,997
mtn: bytes in | bytes out | certs in | revs in
mtn:    1.6 k |     1.9 k |      0/0 |     0/0
mtn: successful exchange with nabagan.bluegap.ch
mtn -k address@hidden pull -d ../test.db nabagan.bluegap.ch  4.82s user 0.12s system 28% cpu 17.386 total
# time ./mtn gsync -d ../test.db http://nabagan.bluegap.ch:8080/monotone/
mtn: 13,850 common revisions
mtn: 130 frontier revisions
mtn: 0 outbound revisions
mtn: 0 inbound revisions
./mtn gsync -d ../test.db http://nabagan.bluegap.ch:8080/monotone/  1.48s user 0.13s system 38% cpu 4.172 total
(Avg ping time from here to nabagan.bluegap.ch is ~60 ms)
(Agreed, that's not a fair comparison either, because gsync doesn't pull
certs.)
> (1) printing/parsing basic_io has come up in the past and nuskool adds
> very similar printing/parsing json_io so it will probably double the
> printing/parsing time.
That applies to the current HTTP channel. Other channels might or might
not use JSON. Or maybe we even want to add different content types for
HTTP, i.e. return JSON or raw binary, depending on the HTTP Accept header.
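
A minimal sketch of that kind of content negotiation, assuming a hypothetical render_file_data helper on the server side. It ignores q-values and wildcards, and the hex encoding for the JSON case is just one possible choice, not something nuskool is known to do.

```python
import json

def render_file_data(data: bytes, accept: str):
    """Pick a response encoding from the Accept header value.

    Simplified on purpose: no q-value or wildcard handling.
    Returns (content_type, body_bytes).
    """
    offered = [part.split(";")[0].strip() for part in accept.split(",")]
    if "application/octet-stream" in offered:
        # Client asked for raw binary: send the bytes untouched.
        return "application/octet-stream", data
    # Otherwise fall back to JSON; the bytes are hex-encoded because
    # JSON strings are text, not arbitrary binary.
    body = json.dumps({"data": data.hex()}).encode("utf-8")
    return "application/json", body
```

A client that only understands JSON keeps working, while a client that sends "Accept: application/octet-stream" avoids the encoding overhead.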
> (2) it's currently very granular, request one revision, receive one
> revision, then for all files changed in the revision request one file
> data or delta, receive one file data or delta, etc. until all the
> content for the revision has been received, then move on to the next
> revision. latency of request/response times is probably a big factor.
Agreed. However, merging multiple get requests for a single resource
into one multiplex request is just one option to solve that problem.
Another one would be running multiple queries in parallel. Dunno how
feasible that is, though.
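
To make the latency argument concrete: with a round-trip time of roughly 60 ms, N strictly sequential requests cost at least N x 60 ms in waiting alone. A minimal sketch of the parallel alternative follows, using a thread pool. The fetch function is a stand-in that only sleeps, and the 20 ms latency is an assumed value chosen to keep the demo quick; no real HTTP is involved.

```python
import time
from concurrent.futures import ThreadPoolExecutor

LATENCY = 0.02  # simulated round-trip time in seconds (assumed value)

def fetch(item_id):
    """Stand-in for one request/response pair; only simulates the wait."""
    time.sleep(LATENCY)
    return item_id, f"data-for-{item_id}"

def fetch_sequential(ids):
    # One request at a time: total wait is roughly len(ids) * LATENCY.
    return [fetch(i) for i in ids]

def fetch_parallel(ids, workers=8):
    # Requests overlap in time; map() still returns results in input order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fetch, ids))

ids = [f"file{n}" for n in range(16)]

start = time.perf_counter()
seq = fetch_sequential(ids)
t_seq = time.perf_counter() - start

start = time.perf_counter()
par = fetch_parallel(ids)
t_par = time.perf_counter() - start

print(f"sequential: {t_seq:.3f}s, parallel: {t_par:.3f}s")
```

Each request stays as simple as before; only the client changes. Whether a server copes well with many concurrent small requests is a separate question.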
(Using threads could also help hash calculation... considering our
commodity hardware boxes are getting more and more cores per box, that
might be worth it in the long run).
Plus: keeping that simplicity would allow us to handle dumb servers
in pretty much the same way.
> I went with the fine-grained get/put request/response pairs so that
> neither side would end up having to hold too many files in memory at any
> one time. If we instead requested all file data/deltas for one rev the
> number of round trips would be reduced but we'd end up having to hold at
> least one copy (probably more) of the works in memory which didn't seem
> so good. I'm open to suggestions. ;)
I don't think files necessarily need to be put together by revision -
that would be a rather useless collection for small changes. Instead, we
should be able to collect any number of files together - and defer
writing the revision until we have all of them.
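
A minimal sketch of that idea: file data arrives in arbitrary batches, and a revision is only written once every file it refers to has been stored. The PendingStore class and its method names are hypothetical, and plain in-memory dicts stand in for the monotone database.

```python
class PendingStore:
    """Toy receiver: stores files as they arrive, defers revisions."""

    def __init__(self):
        self.files = {}      # file id -> content (stands in for the database)
        self.pending = {}    # revision id -> set of file ids still missing
        self.revisions = []  # revision ids written so far, in order

    def expect_revision(self, rev_id, file_ids):
        """Register a revision; write it at once if nothing is missing."""
        missing = set(file_ids).difference(self.files)
        if missing:
            self.pending[rev_id] = missing
        else:
            self.revisions.append(rev_id)

    def receive_batch(self, batch):
        """Store a batch (file id -> content) grouped in any way, not by revision."""
        self.files.update(batch)
        for rev_id in list(self.pending):
            self.pending[rev_id].difference_update(batch)
            if not self.pending[rev_id]:
                # All files for this revision are present: write it now.
                del self.pending[rev_id]
                self.revisions.append(rev_id)
```

The batch size can then be chosen by memory budget or byte count rather than by revision boundaries, which speaks to the memory concern raised above.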
> In terms of the printing/parsing, Zack mentioned a while ago the idea of
> a binary encoding for revs and I had been thinking along the same lines.
> A very simple to read/write serialization would be good. I'm not sure if
> the json form has any real benefit or not, whether arbitrary web clients
> would be interested in the rev formats, etc. or whether a simple binary
> form would be better.
I certainly think of JSON as a good exchange format. It doesn't only
help JavaScript, but provides a good mixture between well structured
data (think XML) and raw binary data. It provides some structure, but
it's not overly verbose. And it's easily usable from pretty much any
scripting language.
However, one of the downsides of JSON is: it cannot encode binary data.
Or more precisely: strings are interpreted as UTF-8 encoded text, so you
had better not write raw binary data in there.
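
A short demonstration of the problem and of the usual workaround, using Python's standard json module. Wrapping the bytes in base64 is an assumption made for illustration, not something nuskool is known to do, and it costs about a third in size.

```python
import base64
import json

raw = bytes([0x00, 0xFF, 0x80, 0x7F])  # arbitrary bytes, not valid UTF-8

def to_json(data: bytes) -> str:
    """Wrap binary data in base64 so it fits into a JSON string."""
    return json.dumps({"data": base64.b64encode(data).decode("ascii")})

def from_json(text: str) -> bytes:
    """Reverse of to_json."""
    return base64.b64decode(json.loads(text)["data"])

# Passing the bytes directly fails: the json module refuses bytes objects.
try:
    json.dumps({"data": raw})
    direct_ok = True
except TypeError:
    direct_ok = False

# Decoding them as UTF-8 text first fails as well.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

restored = from_json(to_json(raw))
```

The round trip is lossless, but the encoded payload is no longer readable by eye, which weakens one of the arguments for JSON in the first place.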
Thus, JSON and binary encoding for revs don't seem to mix well here. As
much as I like binary encoded stuff for internal things, I also like to
be able to read the revision's contents.
Once again, this makes me think about using the revisions solely for
synchronization and storing (binary) rosters in the database instead of
the revisions themselves.
Regards
Markus