
Re: [Gnu-arch-users] Arch Versus CVS Versus Subversoin

From: John A Meinel
Subject: Re: [Gnu-arch-users] Arch Versus CVS Versus Subversoin
Date: Sun, 05 Dec 2004 14:59:32 -0600
User-agent: Mozilla Thunderbird 0.9 (Windows/20041103)

Tom Browder wrote:

One major thing that subversion offers that I haven't seen on the
comparisons between arch and other scm programs is the capability to version
control binary files with binary diffing.  This is an important feature for
me and I would like to use subversion.

However, and unfortunately, the subversion folks refuse to listen to my
CVS-user suggestions for the same type tags as rtag and use of an equivalent
to CVSROOT, otherwise I would jump on subversion.

Does arch offer or plan to offer those three features (CVS rtag equivalent,
binary diff, CVSROOT equivalent)?


Tom Browder

Arch does not support binary diff in the Subversion sense, because binary diff only makes sense if you have the complete history: for each change you must have the bit-wise identical previous version (binary diff only supports exact patching). One of the ideas behind arch is that, because patch supports fuzzy patching, each revision has meaning by itself. Say you fixed a bug and committed it as patch-23 in your archive. My archive is similar to yours, but I've made changes in the meantime (even to the same files that you changed). I can take your patch-23 and, in the normal (non-conflicting) case, apply it to my tree without knowing much else about your tree.

This is possible because the generated diffs contain context, and this context can be used when patching. With a binary diff there isn't really the same idea of context. For instance, say the binary file is a bitmap, and I add a circle to the bottom left corner while you add a circle to the bottom right corner. The proper "diff" would be just that change, and if you applied your patch and my patch, you should end up with a circle in the bottom left and one in the bottom right. In reality the diffs would at the very least conflict. (In a bitmap, data is stored row-wise, so you changed the same rows I changed.) Things get even worse with something like PNG images, where the data is compressed: the proper "diff" would again have to know about the uncompressed image and what changed, so that patch could change only the altered pixels.
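To make the difference concrete, here is a minimal Python sketch of the two styles of patching. It is not how patch or any binary delta tool is actually implemented: the function names are made up for illustration, and the context search only tolerates a shifted position (real patch can also relax the context itself).

```python
def apply_hunk(lines, before, old, new, after, near):
    """Apply one hunk by searching for its context (hypothetical helper).

    lines        -- target file as a list of strings
    before/after -- unchanged context lines around the change
    old/new      -- lines to remove / lines to put in their place
    near         -- index where the hunk applied in the author's tree (a hint)
    """
    pattern = before + old + after
    # Try every possible start position, closest to the hint first.
    candidates = sorted(range(len(lines) - len(pattern) + 1),
                        key=lambda i: abs(i - near))
    for start in candidates:
        if lines[start:start + len(pattern)] == pattern:
            cut = start + len(before)
            return lines[:cut] + new + lines[cut + len(old):]
    raise ValueError("context not found: conflict")


def apply_byte_delta(data, offset, old, new):
    """Exact patching: the base must be byte-identical at a fixed offset."""
    if data[offset:offset + len(old)] != old:
        raise ValueError("base differs: delta does not apply")
    return data[:offset] + new + data[offset + len(old):]


# Your tree, where the fix was made, and my tree, which has drifted.
yours = ["a", "b", "bug", "c", "d"]
mine = ["header", "extra", "a", "b", "bug", "c", "d", "footer"]

# The context-based hunk still finds its place in my tree.
patched = apply_hunk(mine, ["b"], ["bug"], ["fix"], ["c"], near=1)

# The same change as a byte delta against your file does not apply to mine.
yours_bytes = "".join(line + "\n" for line in yours).encode()
mine_bytes = "".join(line + "\n" for line in mine).encode()
offset = yours_bytes.index(b"bug")
try:
    apply_byte_delta(mine_bytes, offset, b"bug", b"fix")
    delta_applied = True
except ValueError:
    delta_applied = False
```

The hunk carries enough of its surroundings to be located again, while the byte delta only carries a position that is meaningful against one exact base.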

In general, to do appropriate binary diffs you must know the file format and have an intelligent diff program. The hardest part is that everyone who uses your archive must also be able to understand the diffs, so that they can run their "patch" and get an exact copy of what you had.
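The bitmap case can be sketched in a few lines of Python. This is only a toy under stated assumptions: the "image" is a small grid of integers, the "circles" are three-pixel blobs, and the helper names are invented for the example. It shows that a diff which only sees rows reports overlapping changes, while a diff that understands pixels sees two disjoint edits that merge cleanly.

```python
def changed_rows(base, edited):
    """Row-level view: which rows differ at all (format-unaware)."""
    return {r for r, (a, b) in enumerate(zip(base, edited)) if a != b}


def changed_pixels(base, edited):
    """Pixel-level view: map of (row, col) -> new value (format-aware)."""
    return {(r, c): v
            for r, row in enumerate(edited)
            for c, v in enumerate(row)
            if base[r][c] != v}


def merge_pixels(base, mine, yours):
    """Merge two edits of the same base, failing only on a real pixel clash."""
    a, b = changed_pixels(base, mine), changed_pixels(base, yours)
    clash = {k for k in a.keys() & b.keys() if a[k] != b[k]}
    if clash:
        raise ValueError("pixels changed differently in both edits")
    merged = [row[:] for row in base]
    for (r, c), v in {**a, **b}.items():
        merged[r][c] = v
    return merged


WIDTH, HEIGHT = 8, 4
base = [[0] * WIDTH for _ in range(HEIGHT)]

mine = [row[:] for row in base]      # blob in the bottom left corner
mine[3][0] = mine[3][1] = mine[2][0] = 1

yours = [row[:] for row in base]     # blob in the bottom right corner
yours[3][7] = yours[3][6] = yours[2][7] = 1

row_overlap = changed_rows(base, mine) & changed_rows(base, yours)
merged = merge_pixels(base, mine, yours)
```

Here `row_overlap` is non-empty even though no single pixel was touched by both edits, which is the conflict described above. The pixel-aware merge works, but only because both sides agree on what a pixel is.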

I think the reason people want binary diff (it's why I wanted it) is to decrease the amount of space used in the archive. At first glance it seems silly to keep 10 copies of a 100k file when only part of it changes. On the other hand, it means that every one of your 10 revisions is independently useful: you don't need the previous revision to understand the next one. Also, unless your binary file is really big and the changes are very localized, you don't save that much space, and space is relatively cheap. Disk costs only about $1/GB. $200 (less than a week's salary for most people) gets you enough space to store lots and lots of copies of your files. Bandwidth isn't quite as cheap (and downloading a full copy of the file 20 times is probably expensive), but cacherevs can help prevent that. Arch also supports other enhancements (local archives, revision libraries) which trade off bandwidth versus space. And finally, all changes are stored compressed in the archive, so in many cases the compression will do better than a binary diff would.
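For a sense of scale, the numbers above can be worked through in Python. The price, file size, revision count and budget are the figures from this message; the function names and the choice of 1 GB = 1024 * 1024 KB are assumptions of the sketch.

```python
KB_PER_GB = 1024 * 1024  # assumption: binary units


def storage_cost_dollars(file_kb, revisions, price_per_gb):
    """Cost of keeping a full copy of the file for every revision."""
    return file_kb * revisions / KB_PER_GB * price_per_gb


def copies_affordable(budget_dollars, file_kb, price_per_gb):
    """How many full copies of the file a given budget can store."""
    return int(budget_dollars / price_per_gb * KB_PER_GB // file_kb)


# 10 full copies of a 100k file at $1/GB.
cost = storage_cost_dollars(file_kb=100, revisions=10, price_per_gb=1.0)

# How many 100k copies $200 of disk would hold at the same price.
copies = copies_affordable(budget_dollars=200, file_kb=100, price_per_gb=1.0)

print(f"10 copies cost ${cost:.6f}; $200 stores {copies} copies")
```

Ten full copies come to roughly a tenth of a cent, and $200 covers about two million copies, so for files of this size the space saved by a binary diff is hard to notice.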

Sorry this got so long, but there are specific reasons that arch doesn't support binary diffs, and it isn't just that it hasn't implemented them yet.

In response to rtag: arch's ideas of tags, branches, etc. are considerably different from CVS's. I believe you are trying to create a tag on a file without having a local copy. 'tla tag' does exactly this, but you are more likely to want to use a configuration, or something like that. It is very easy to create a meta-project that keeps track of any sort of tags, etc. that you want for some other project (these are the configurations I speak of). Then you get version-controlled meta-information. (In CVS, when you move a tag, it is moved for good.)


PS> If anyone else wants to comment, please do, but this is my feeling about how arch operates.

