Re: [Gnu-arch-users] Arch Versus CVS Versus Subversoin

From: John A Meinel
Subject: Re: [Gnu-arch-users] Arch Versus CVS Versus Subversoin
Date: Sun, 05 Dec 2004 19:27:37 -0600
User-agent: Mozilla Thunderbird 0.9 (Windows/20041103)

Andrew Suffield wrote:
On Sun, Dec 05, 2004 at 02:59:32PM -0600, John A Meinel wrote:

Arch does not support binary diff in the subversion sense.

Nonsense, it works just fine.

No. In subversion, if you have a binary file and you make changes, it stores only the changes to the file (using something like xdelta). So if you have a binary file that is 100k long and you append 10k to it, the changeset stored in the archive is only ~10k in size. (I think the same is true if you modify only 10k of the file.)
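
To make the 100k + 10k example concrete, here is a rough sketch of the idea. It is NOT how subversion or xdelta actually compute deltas; it only handles the append case by keeping the bytes past the end of the old file, and the file names are made up for illustration:

```shell
# Build a 100k "binary" file and a copy with 10k appended to it.
work=$(mktemp -d)
dd if=/dev/urandom of="$work/old.bin" bs=1000 count=100 2>/dev/null
cp "$work/old.bin" "$work/new.bin"
dd if=/dev/zero bs=1000 count=10 2>/dev/null >> "$work/new.bin"

old_size=$(wc -c < "$work/old.bin" | tr -d ' ')

# Naive append-only "delta": keep just the bytes past the end of the old file.
tail -c +"$((old_size + 1))" "$work/new.bin" > "$work/delta.bin"

# Rebuilding the new file from old + delta shows the delta is sufficient.
if cat "$work/old.bin" "$work/delta.bin" | cmp -s - "$work/new.bin"; then
    rebuilt=yes
else
    rebuilt=no
fi

delta_size=$(wc -c < "$work/delta.bin" | tr -d ' ')
echo "old file: $old_size bytes, delta: $delta_size bytes, rebuilt ok: $rebuilt"
rm -rf "$work"
```

A real binary diff also has to cope with changes in the middle of the file, which this sketch does not attempt.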

However, with arch, if a file is declared binary and it is considered to be changed, then a completely new copy of the file is added to the repository. Actually, I just checked, and 2 copies are added to the repository: the old version and the new one.

Here are the steps to reproduce:

$ mkdir binary
$ cd binary
$ tla init-tree binary--dev--0
$ cp ~/image.png .
$ tla add image.png
$ tla import -S
$ dd if=/dev/zero bs=1000 count=10 >> image.png
$ tla changes
Mb image.png
$ tla commit -s "modified the binary."
$ tar xvzf $ARCHIVE/binary/binary--dev/binary--dev--0/patch-1/binary--dev--0--patch-1.patches.tar.gz
$ cd binary--dev--0--patch-1.patches/patches
$ ll
-rw-r--r--    1 jfmeinel users       29632 Dec  5 19:16 image.png.modified
-rw-r--r--    1 jfmeinel users       19632 Dec  5 19:16 image.png.original

This is most definitely NOT what the original poster meant by "binary diff".

Arch handles binary files, and it does very well with them. But it *does not* store just the differences. It actually stores 2 whole copies for any change, so that you can compute the difference yourself if you so choose.
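
Putting numbers on it, using the two file sizes from the listing above (the variable names are just for illustration, and this ignores the tar and gzip overhead of the changeset):

```shell
# Sizes taken from the directory listing of the patch-1 changeset.
original=19632
modified=29632

# What was actually changed: the 10 blocks of 1000 bytes appended with dd.
appended=$((modified - original))

# What the changeset carries before compression: both whole copies.
stored=$((original + modified))

echo "appended: $appended bytes"
echo "stored:   $stored bytes"
```

So a 10000 byte change puts 49264 bytes into the changeset before compression.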

In general, to do appropriate binary diffs, you must know the file format, and have an intelligent diff program.

That's not a binary diff. That's some other format-specific diff.

I think the reason people want binary diff (it's why I wanted it) was to decrease the amount of space used in the archive.

Compression is not related in any way.

Compression is not binary diff, true. But the reason people want binary diff is so that when you have a 10MB Word document and you change 2 lines in it, the commit doesn't have to upload 10MB of data. In arch's case, it will actually upload about 20MB of data. (It will compress first, but that is a different subject.)

However, since arch compresses the changeset, the upload should shrink a fair amount. In fact, if your files are small enough, you should get great compression when one is mostly just a copy of the other.
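
The "small enough" part matters because gzip's DEFLATE format can only refer back about 32 KB when looking for repeated data. Here is a quick experiment you could run to see the effect. It is a sketch only: it uses random data instead of a real image, plain gzip instead of an actual arch changeset tarball, and the `pair_size` helper is something I made up for the test:

```shell
work=$(mktemp -d)

# Print the gzip-compressed size of an original/modified pair, where the
# original is ($1 * 1000) random bytes and the modified copy has 1000
# zero bytes appended to it.
pair_size() {
    dd if=/dev/urandom of="$work/original" bs=1000 count="$1" 2>/dev/null
    cp "$work/original" "$work/modified"
    dd if=/dev/zero bs=1000 count=1 2>/dev/null >> "$work/modified"
    cat "$work/original" "$work/modified" | gzip -c | wc -c | tr -d ' '
}

small=$(pair_size 10)     # 10000 byte original, fits inside the 32 KB window
large=$(pair_size 100)    # 100000 byte original, far larger than the window

echo "10000 byte file:  both copies compress to $small bytes"
echo "100000 byte file: both copies compress to $large bytes"
rm -rf "$work"
```

For the small file, the second copy costs almost nothing, so the pair should come out close to the size of one copy. For the large file, gzip can no longer see the first copy by the time it reaches the second, so the pair should come out close to the size of both copies together.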


