[Monotone-devel] Re: Support for binary files, scalability and Windows p


From:	graydon hoare
Subject:	[Monotone-devel] Re: Support for binary files, scalability and Windows port
Date:	Mon, 19 Jan 2004 11:42:10 -0500
User-agent:	Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6b) Gecko/20031205 Thunderbird/0.4

Asger Kunuk Ottar Alstrup wrote:

So, in summary: I appreciate the time you are spending on monotone, and
I appreciate the friendly and open tone with which you approach this
discussion.

hey, so long as we can keep the discussion in terms of "finding a satisfactory compromise" rather than "forcing one party to admit he's wrong", I'm sure we'll get along fine. I only mentioned a code fork as an afterthought, in the sense of "if you find that I am being intolerably unreasonable, there's always this route..." :)

This is because I do not think this use case has priority over the use
case where I revert a change, but another party does not.

ah, right. again, my background concerns are all about source code, in which the concern is to make merging robust and painless (well, and doing strong QA, which also benefits from primacy of hashed identifiers), so I very much feel that it takes priority.

in any case, reversion of a manifest can be represented as a back-edge in the ancestry version graph or a cancellation of the forward edge (and note: you'd have to revert the entire manifest, not just a file, because only manifests are chained together in a history graph).

So, my suggestion is to separate concerns: Identify each version of a
file with a truly unique identifier in the version DAG, and then have a
separate scheme for representing each version of a file in a compact and
efficient way.

hmm. I read this as an elaboration of what you had in mind already: making identifiers into semi-random UUIDs which are mapped to hashes, rather than equal to hashes. while logically *possible*, I still don't see any way to do this without the associated costs:


   - all operations on an "id" (comparison, i/o, synchronization) go
     via an indirection table which associates hashes with UUIDs

   - writing code to construct this table, synchronize it, evaluate its
     trust, etc. and modifying all operations to use it is a
     considerable amount of work

   - this table is a new point of failure in the system, and a new
     vector for attacks which play with trust relationships

   - the intrinsic integrity-checking associated with frequent hashing
     is lost
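
to make the contrast concrete, here is a minimal sketch of the two schemes. it is an illustration only, not monotone code: `content_id`, `verify_content_id` and `UuidTable` are hypothetical names invented for this example, and SHA-1 via Python's `hashlib` is assumed as the hash.

```python
import hashlib
import uuid


def content_id(data: bytes) -> str:
    """Identifier that *is* the hash of the content (SHA-1 assumed here)."""
    return hashlib.sha1(data).hexdigest()


def verify_content_id(ident: str, data: bytes) -> bool:
    # Integrity checking needs nothing but the id and the bytes themselves.
    return content_id(data) == ident


class UuidTable:
    """Hypothetical indirection table mapping random UUIDs to content hashes."""

    def __init__(self) -> None:
        self._hash_for = {}

    def register(self, data: bytes) -> str:
        # Every registration yields a fresh id, even for identical content.
        ident = str(uuid.uuid4())
        self._hash_for[ident] = content_id(data)
        return ident

    def same_content(self, a: str, b: str) -> bool:
        # Comparing two ids now costs two table lookups.
        return self._hash_for[a] == self._hash_for[b]

    def verify(self, ident: str, data: bytes) -> bool:
        # Only as trustworthy as the table: a tampered entry goes unnoticed.
        return self._hash_for.get(ident) == content_id(data)
```

in the first scheme an id can be checked against the bytes with no other state. in the second, whoever controls or corrupts the table decides what an id "means", which is the extra point of failure and trust mentioned above.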

these, to me, are heavy costs. let me present an alternative which, from my perspective, shifts the workload to the user who has this (imo unusual) need to version-control large video files without ever hashing or merging them:


     do version control on directories full of small text files which
     contain nothing but the UUID of a video file, or better yet a URL.
     make a persistent attribute in .mt-attrs which treats each file as
     a request to have the associated large file transferred from a
     video server via wget and stored in a bucket of video files in
     ~/.huge-video-files, then symlinked into place in your
     configuration tree.
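
a rough sketch of the fetch-and-symlink step, under stated assumptions: the pointer file holds a single URL, the bucket entry is named after a hash of the URL (an invented convention, cheap because the URL is tiny), and `urllib.request.urlretrieve` stands in for wget. `materialize` and `default_fetch` are hypothetical helpers; how such a script would be triggered from an attribute in .mt-attrs is not shown here.

```python
import hashlib
import os
import urllib.request


def default_fetch(url: str, dest: str) -> None:
    # Stand-in for the wget call; downloads url to the path dest.
    urllib.request.urlretrieve(url, dest)


def materialize(pointer_path, bucket_dir, link_path, fetch=default_fetch):
    """Fetch the file named by a pointer file and symlink it into place."""
    # The version-controlled pointer file contains nothing but a URL.
    with open(pointer_path, encoding="utf-8") as f:
        url = f.read().strip()

    os.makedirs(bucket_dir, exist_ok=True)
    # Hash the short URL string, never the large file itself.
    name = hashlib.sha1(url.encode("utf-8")).hexdigest()
    stored = os.path.join(bucket_dir, name)

    # Transfer only if the bucket does not already hold this file.
    if not os.path.exists(stored):
        fetch(url, stored)

    # Replace any stale link, then point the tree at the bucket entry.
    if os.path.lexists(link_path):
        os.remove(link_path)
    os.symlink(stored, link_path)
    return stored
```

something like `materialize("clips/intro.url", os.path.expanduser("~/.huge-video-files"), "clips/intro.avi")` would be run once per pointer file after a checkout; the paths in that call are made up for illustration.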

in this "solution" to the problem, monotone is not changed at all. it is doing what it was built to do. you can still use it to manage configurations of your data, evaluate the trustworthiness of various configurations, checkpoint and restore them, trade them with friends, etc. but you have decided that your data are *so big* that loading them into memory and hashing them are too high a cost (not to mention doing common subsequence searches on the bits during storage, or pointlessly gzipping them in-memory, or doing all transmission via base64), and that being able to merge video files is an insufficient benefit. so you moved the files themselves out of monotone's storage management.


-graydon
