monotone-devel

[Monotone-devel] Re: user friendly revision identifiers


From: graydon hoare
Subject: [Monotone-devel] Re: user friendly revision identifiers
Date: Fri, 03 Dec 2004 17:44:52 -0500
User-agent: Mozilla Thunderbird 0.8 (X11/20040913)

Emile Snyder wrote:

I have been (mostly) lurking on this list for a while now, so I hope it
isn't too presumptuous of me to jump in with this... please take it with
the grain of salt appropriate for ideas not accompanied by code.

oh, please don't feel shy about contributing. it's a public project.
hopefully the fact that I'm responding here in the negative won't put you off voicing future input. your point is certainly valid!

It seems to me that the desirable properties (from a user perspective)
for revision identifiers are:
* Unique across all repositories (hashes are excellent)

for your proposal, here is the crux of the problem: repositories have no more intrinsic identity than files or manifests. you can copy them (and often do, when making backups). any time you copy a repository, say "repository a", you now have two repositories which claim to be the same repository. you can try to differentiate the numbers they issue by using timestamps, or time + physical device + inode, or time + GUID + IP, or some combination thereof, but that generally means a fair bit of new code, both to implement the correct case and to try to recover from failures in the incorrect case. this is the same issue as with externally identifying manifests or files (as many people tell me I'm crazy not to do...)
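to make the copy problem concrete, here is a toy sketch (the `Repo` class and its counter scheme are invented for illustration, not monotone code): two copies of a repository that issue sequential revision numbers will hand out the same number for different revisions, while content hashes stay distinct.

```python
import copy
import hashlib

class Repo:
    """Toy repository that issues sequential revision numbers (hypothetical)."""
    def __init__(self):
        self.counter = 0

    def next_id(self):
        self.counter += 1
        return self.counter

a = Repo()
a.next_id()           # revision 1, committed before the backup

b = copy.deepcopy(a)  # "backup" copy -- both now claim to be the same repository

# each copy commits a *different* revision, but issues the *same* number
id_in_a = a.next_id()
id_in_b = b.next_id()
print(id_in_a == id_in_b)  # True: the externally issued identifiers collide

# content hashes of the divergent revisions do not collide
h_a = hashlib.sha1(b"revision committed in copy a").hexdigest()
h_b = hashlib.sha1(b"revision committed in copy b").hexdigest()
print(h_a == h_b)  # False: a hash names the content, not the issuer
```

recovering from the collision requires exactly the sort of extra bookkeeping (device ids, GUIDs, timestamps) described above, whereas the hash scheme never enters the bad state at all.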

I guess there's a fundamental issue here I don't address enough: from my perspective, bugs in implementation are *the most prevalent* form of failure this system has to put up with, so when someone discusses a scheme involving any form of external identifiers (rather than hashes) my first thought is to consider how much more (or less) code will be involved. usually it's more: code to manage extra tables and extra relationships between these tables and existing ones. the code to implement hashing is actually quite small and easily tested.
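as a rough illustration of how little code content hashing needs (a sketch under my own assumptions about the manifest layout, not monotone's actual implementation; monotone does use SHA-1 over file, manifest, and revision texts):

```python
import hashlib

def ident(data: bytes) -> str:
    """Return the 40-hex-digit SHA-1 identifier of some content,
    in the style monotone uses for files, manifests, and revisions."""
    return hashlib.sha1(data).hexdigest()

# a manifest maps paths to the hashes of the corresponding file contents
files = {"README": b"hello\n", "src/main.cc": b"int main() {}\n"}
manifest = "".join(
    f"{ident(data)}  {path}\n" for path, data in sorted(files.items())
)

# the manifest is itself content, so it gets an identifier the same way
manifest_id = ident(manifest.encode())
print(len(manifest_id))  # 40
```

no extra tables, no cross-table relationships: identity falls out of the content itself, and the whole mechanism is a few easily tested lines.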

unless the amount of code is likely to be less than using hashes, or the failure rate or user-unfriendliness associated with current practice is shown to be significantly worse than the bugginess of the new code we're considering, I tend to not want the new code. crashing or losing versions is, after all, very user-unfriendly behavior.

(note for example that netsync and revisions deleted almost as much buggy code as they added, and were motivated by very serious real-world failures)

-graydon



