Re: [Gnu-arch-users] Arch Versus CVS Versus Subversoin

From: John A Meinel
Subject: Re: [Gnu-arch-users] Arch Versus CVS Versus Subversoin
Date: Sun, 05 Dec 2004 20:09:30 -0600
User-agent: Mozilla Thunderbird 0.9 (Windows/20041103)

Andrew Suffield wrote:
On Sun, Dec 05, 2004 at 07:27:37PM -0600, John A Meinel wrote:

Andrew Suffield wrote:

On Sun, Dec 05, 2004 at 02:59:32PM -0600, John A Meinel wrote:

Yes, arch supports binary diffs in exactly the same sense as subversion.

It doesn't do delta compression on them. That's irrelevant.

It is not irrelevant, people care about it.

xdelta is not a diff. It's delta compression. That's even in the *name*.

This is most definitely NOT what the original poster was meaning by "binary diff".

You mean they were out of their tree and using the wrong names for
things? You'll never get anywhere like that. "$foo diff" is an
operation which generates the input to "$foo patch", such that:

patch(A, diff(A, B)) == B, where A and B are of type $foo

That's the definition.
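
To make that definition concrete, here is a minimal Python sketch of a whole-file "binary diff" that simply keeps the old and the new contents. This is an illustration only, not arch's actual changeset format; the names diff and patch are borrowed from the formula above. It shows that a diff can satisfy the round-trip property without being small:

```python
from typing import Tuple

def diff(a: bytes, b: bytes) -> Tuple[bytes, bytes]:
    # Whole-file "diff": record the old and the new contents verbatim.
    return (a, b)

def patch(a: bytes, d: Tuple[bytes, bytes]) -> bytes:
    old, new = d
    # Refuse to apply against a file the diff was not made from.
    if a != old:
        raise ValueError("diff does not apply to this file")
    return new

A = b"\x00\x01binary payload\xff"
B = b"\x00\x01binary payload, edited\xff"
assert patch(A, diff(A, B)) == B
```

The property holds, but the diff is as large as A and B together, however small the edit was.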

This is the first time that I've heard it (properly) called delta compression.

Compression is not related in any way.

Compression is not binary diff, true. But the reason people want binary diff is so that when you have a 10MB Word document and you change 2 lines in it, the commit doesn't have to upload 10MB of data.

No, the reason people want binary diffs is so that they can have
binary files stored in revision control. That's what it means and
that's what it does.

I do believe that what the original poster was asking for was "binary deltas". Yes, we support "binary diff" as you name it. Your terminology is probably better than mine. But CVS has long supported binary files; what people didn't like is that every time they made a change, the size of their repository went up by the complete file size instead of just the differences. The reason everyone else calls it a binary diff is that a normal diff contains only the changes, whereas your version of a binary diff contains the entire file, pre and post.
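
The size difference is easy to see in a rough Python sketch. It uses difflib.SequenceMatcher from the standard library as a stand-in for a real delta encoder such as xdelta. The op format and the names make_delta, apply_delta and literal_bytes are made up for illustration and are not what SVN or arch actually do:

```python
import random
from difflib import SequenceMatcher

def make_delta(old: bytes, new: bytes) -> list:
    """Describe `new` as copies from `old` plus literal new bytes."""
    ops = []
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))        # reuse old[i1:i2]
        elif tag in ("replace", "insert"):
            ops.append(("data", new[j1:j2]))    # bytes that must be stored
        # "delete": bytes of old that are simply not copied
    return ops

def apply_delta(old: bytes, ops: list) -> bytes:
    """Rebuild the new file from the old file and the delta."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            out += old[op[1]:op[2]]
        else:
            out += op[1]
    return bytes(out)

def literal_bytes(ops: list) -> int:
    """Count the new bytes the delta has to carry."""
    return sum(len(op[1]) for op in ops if op[0] == "data")

rng = random.Random(0)
old = bytes(rng.getrandbits(8) for _ in range(4000))   # stand-in for a binary file
new = old[:2000] + b"two changed lines" + old[2010:]   # small edit in the middle

delta = make_delta(old, new)
assert apply_delta(old, delta) == new
print("whole-file pre and post:", len(old) + len(new), "bytes")
print("literal bytes in delta: ", literal_bytes(delta), "bytes")
```

Both forms satisfy patch(A, diff(A, B)) == B. The only difference is how many bytes have to be stored and uploaded, which is exactly what the complaints are about.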

Yes, I understand why it does that. And I'm not really advocating that it changes. I might argue that we would be better off storing the md5sum of the previous version rather than a complete copy of it, but generally that is irrelevant.
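
Roughly what I have in mind for the md5sum idea, as a Python sketch. The dict layout and the names make_changeset and apply_changeset are hypothetical, not anything arch implements:

```python
import hashlib

def make_changeset(old: bytes, new: bytes) -> dict:
    # Keep only a checksum of the previous version, plus the full new file.
    return {"old_md5": hashlib.md5(old).hexdigest(), "new": new}

def apply_changeset(current: bytes, changeset: dict) -> bytes:
    # The checksum still lets us detect that we are patching the wrong file.
    if hashlib.md5(current).hexdigest() != changeset["old_md5"]:
        raise ValueError("changeset does not apply to this file")
    return changeset["new"]

old = b"\x00previous binary contents\xff"
new = b"\x00updated binary contents\xff"
cs = make_changeset(old, new)
assert apply_changeset(old, cs) == new
```

The trade-off in this sketch is that the changeset can no longer be applied in reverse, since the old contents are not in it.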

The only reason I can think of to use SVN is if I ever try to revision control a lot of binary files. Think most office document formats, MS and OOo. (OOo is zipped XML, but you are still committing the zip file, which is binary.)

Admittedly, I'm talking about a small population (about 10+ people), but when I was taught CVS I was warned that binary files are not "diffed" but stored in complete form each time. (Obviously this should have been "delta compressed".) I have also heard this argument from independent sources (as mentioned, about 10 times). I realize all of us are using the wrong terminology here. But I know that is what they meant, because the complaint was about the size of the archive, and how much better SVN is because it only stores the changes.

See here:
Subversion stores all files in a binary representation and uses an efficient binary diff algorithm to compute differences between them. This means multiple revisions of binary files take up a lot less space on the server

"efficient binary diff" => delta compression.

