[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Monotone-devel] Re: Support for binary files, scalability and Windows p

From: graydon hoare
Subject: [Monotone-devel] Re: Support for binary files, scalability and Windows port
Date: Fri, 16 Jan 2004 18:44:30 -0500
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6b) Gecko/20031205 Thunderbird/0.4

Asger Kunuk Ottar Alstrup wrote:

OK. The 16 MB limit is a showstopper for us, and probably the hashing as
well.  We are working with raw video files that are on the order of 100
MB big.

yeah. those are really not going to play well. not only hashing will be a problem; it is also the case that monotone sometimes keeps 1, 2, in rare cases 3 copies of a file in memory concurrently, in std::strings.

it is really made for source code control. if you need to use it for controlling very large individual files, it will need quite a bit of changing.

Do you have an impression of what the consequences of changing some
files to use automatically generated unique identifiers instead of
hashes, or partial hashes? In other words, how much of the code relies
on the fact that the id's are hashes of the complete file?

eh, I doubt anything intrinsically *relies* on it; the hash is pretty much treated as an opaque function from data->identifier. you could in theory change it to use any sort of hashing scheme: partial, smaller, faster, bigger, stronger. there are many to choose from.

Regarding, that is too big for a 5 minute inspection, so
do you know in more detail what changes are required?

Maybe the best bet is simply to try, and see what breaks?

yup. I don't actually know for sure that it's broken; my estimate was based on how much I could imagine any hypothetical breakage costing, not a detailed analysis. it may well already parse, but if it doesn't that's not going to take too long to fix. after that you probably just need to make sure it is decoded (uudecode?) and reconstructed properly (i.e. not using the ed-line-script thing textual RCS entries use).

Regarding big-file support: I was wondering whether that could be done
without changing sqlite: A big file can be defined as concatenations of
many smaller chunks. Even if you bump the limit to 2^32, it will still
fail for some users, so maybe it's better to come up with a scheme
without such a "low" limit?
I'm not sure how that fits into the architecture, though.

hmm. this comment, and your talk about network distribution, has given me a lot to chew on. sorry it's taken so long to respond. these are issues which I agonized over quite a bit near the beginning, and am still not completely happy with. your idea is good. I will proceed to talk out loud for a minute here.

currently, files are stored, as you have probably guessed, as head versions + xdeltas. the xdeltas reach into the past. head versions are identified by SHA1, deltas are identified by (pre,post) pairs of SHA1s -- the versions on either side of the delta. xdelta is itself based somewhat on hashing; common blocks are found by indexing all the blocks in the first file under their adler32 code, and rolling an adler32 window of the same block size over the second file one byte at a time. then it writes a copy/insert delta stream. when you make a change the forwards xdelta is queued to send to the network, and the reverse xdelta is stored in the database, moving the head version to your newly commited version.

this storage management system need not be the case. another feasible form would involve removing the file/delta distinction and storing all files as hash trees, where each hash code identifies an existing data fragment (up to a fixed block size) *or* a subtree.

this would in a sense "speed up" access to old files -- or at least make access to all files roughly equal speed -- and as you mention is would integrate nicely with a type of network distribution we don't currently have any real abstraction for: "synchronizing". syncing hash trees works pretty well. it would also easily permit astronomically huge files.

the difficulty with this sort of approach (well, everything related to the rsync algorithm really) is getting the constants right.

you have pressure from one side to make the extents you identify large: the fewer extents you write, the smaller your encoding, and the smaller your summary index of extents is (== less time to search) when you're doing a rolling adler32.

on the other hand, you have pressure from the other side to make the extents you identify small: finer grained extents means that you identify more commonality between blocks.

in theory you can speed up the adler32 (well, for whole-database indexing we would probably move to a larger adler code) by maintaining a medium-sized bloom filter of all the adler codes in your database. it's important to realize how fast xdelta needs to be able to disqualify an adler code: it checks *every byte offset* in second file. you commit a 64k file, it will want to check (64,000 - blocksize) adler codes.

How is monotone doing in this area? Full support requires reimplementing
rsync, but maybe monotone can reuse rsync or something?

you know xdelta is based on rsync, right? :))

yeah, I'm definitely curious to figure out a way to run monotone's networking the other way around. currently it is "replay-based". committing something queues it for transmission, depots replay transmissions, etc. this has some advantages -- you can just concatenate communication history for example -- but seemingly just as many disadvantages. it is fragile, order-sensitive, and requires a bunch of contortions to compensate for.

I'd be just as happy to scrap the entire existing networking system and move to a synchronization-based one, custom protocol or otherwise, if I could work out a really efficient abstraction. synchronization is self-stabilizing, which I'm very attracted to. maybe hash trees are it.

note that any such solution has to synchronize not only file contents but metadata collections. but maybe we can structure things in a way which supports that. for example, suppose we sort all manifests by attached date cert (or, heck, lexicographically) and cluster them into hash trees with a pleasant branching factor. then when I want to sync with you, I send you my "highest" summary hash-tree list (a few hundred manifests) and we sync from manifests to certs and files, then into the file content hash trees. it might well work.

It seems Zbynek Winkler more or less nailed that one today, so that is
good news.

not sure about that being nailed, but it's certainly making very encouraging progress. I'd really like to see a native windows port someday (both to shed cygwin, and to get a native GUI).

The next step would be to develop a TortoiseCVS/TortoiseSVN kind of
client, but that should not require any changes in monotone as such.

it'll probably require some refactoring of, but I'm not at all adverse to cleaning that up. it's a bit of a rat's nest right now.


reply via email to

[Prev in Thread] Current Thread [Next in Thread]