Re: [Gnu-arch-users] Some issues

From: James Blackwell
Subject: Re: [Gnu-arch-users] Some issues
Date: Thu, 10 Jun 2004 14:10:23 -0400

> On Wed, 2004-06-09 at 10:03, Florian Weimer wrote:
>>       * The changeset format is defined relative to GNU patch and GNU
>>         tar. These data formats are still somewhat in flux.

Colin Waters wrote:
> Can you back this up?  I have never heard of any problems.

I can come up with two examples. 

GNU diff and patch are required; the BSD diffutils are no longer
sufficient for arch. The GNU tools have evolved, and that has affected
arch, which has decided (rightly) to follow the GNU path of development.

Also, we require a relatively recent version of GNU diffutils. I believe
that arch doesn't work with the GNU diffutils as packaged in the woody
release of Debian GNU/Linux.

Granted, this evolution is relatively minor and slow paced. However, GNU
arch is very exacting about how diffutils behaves. This means that even
minor differences between versions of diffutils present a go/no-go
situation for GNU arch.
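
To make the go/no-go point concrete, here is a minimal Python sketch of
a version gate on `diff --version` output. It is only an illustration:
the helper names are made up, and the (2, 8) threshold in the usage
lines is a placeholder assumption, not arch's documented requirement.

```python
import re
import subprocess


def parse_gnu_diff_version(version_output):
    """Return (major, minor) if the text names GNU diffutils, else None."""
    first_line = version_output.splitlines()[0] if version_output else ""
    # Matches both "diff (GNU diffutils) 2.8.1" and the older
    # "diff - GNU diffutils version 2.7" banner styles.
    match = re.search(r"GNU diffutils\D*(\d+)\.(\d+)", first_line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def diff_is_acceptable(version_output, minimum):
    """True only for GNU diffutils at or above the (major, minor) minimum."""
    version = parse_gnu_diff_version(version_output)
    return version is not None and version >= minimum


def installed_diff_output():
    """Banner of the local diff, or "" if it is missing or prints nothing."""
    try:
        result = subprocess.run(
            ["diff", "--version"], capture_output=True, text=True, check=False
        )
    except OSError:
        return ""
    return result.stdout


if __name__ == "__main__":
    # (2, 8) is a hypothetical minimum chosen for illustration only.
    print(diff_is_acceptable(installed_diff_output(), (2, 8)))
```

A non-GNU diff that does not print a GNU banner simply fails the gate,
which mirrors the "BSD diffutils are not sufficient" situation.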

> On Wed, 2004-06-09 at 10:03, Florian Weimer wrote:
>>         The changeset format does not handle binaries efficiently,
Colin Waters wrote:
> The changeset format supports them as efficiently as possible.
> Changesets are intended to be used like patches are - i.e. you can send
> and retrieve them as self-contained entities.

This is incorrect, and I know so because Tom and I have discussed it out
of band. 

There are two related reasons why arch isn't doing a better job with
binary diffs. First, adding binary diff support is actually a
difficult job to accomplish. Second, I believe that Tom thinks that the
inclusion of binary files is not a common enough event to warrant the
large amount of work that binary diffs would take.

I see things both ways. I'm currently working on a small SDL game that
uses BMPs to render images. Every time I change the background image, I
have to store a fresh copy of the 1.4 megabyte background. This of
course annoys me as much as it annoys anybody else. 

Sometimes, I get annoyed enough that I start poking around in the code
to implement binary diffing. Inevitably, I re-realize that it's a big
enough job that I don't care *that* much about those huge revisions.
After all, how often do I really change those images? 

In summary, binary diff hasn't happened yet because, among competent
coders, the pain of living without binary diff hasn't exceeded the pain
of effort to solve the problem. :)
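
For readers wondering what "binary diff" would buy, here is a toy
Python sketch of the idea: store copy ranges into the old file plus the
new bytes, instead of a fresh full copy. This is not how arch works or
would work. It leans on difflib from the standard library, which is far
too slow for megabyte files, and the function names are invented for
the illustration.

```python
import random
from difflib import SequenceMatcher


def make_delta(old, new):
    """Describe `new` as copy ranges from `old` plus literal new bytes."""
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    delta = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            delta.append(("copy", i1, i2))       # reuse old[i1:i2]
        elif tag in ("replace", "insert"):
            delta.append(("data", new[j1:j2]))   # bytes only in new
        # "delete" needs no entry: those old bytes are simply not copied
    return delta


def apply_delta(old, delta):
    """Rebuild the new file from the old file and a delta."""
    out = bytearray()
    for op in delta:
        if op[0] == "copy":
            out += old[op[1]:op[2]]
        else:
            out += op[1]
    return bytes(out)


def delta_payload_size(delta):
    """Number of literal bytes the delta has to carry."""
    return sum(len(op[1]) for op in delta if op[0] == "data")


if __name__ == "__main__":
    rng = random.Random(0)
    old = bytes(rng.randrange(256) for _ in range(2000))
    new = old[:500] + b"CHANGED" + old[507:]
    delta = make_delta(old, new)
    assert apply_delta(old, delta) == new
    print(len(new), delta_payload_size(delta))
```

A real implementation would need a scalable matching algorithm and an
on-disk format for the delta, which is where the "big enough job" comes
from.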

> But sure, delta-compression is something that could be done with a smart
> server, as has been discussed in the past.  There's no reason this would
> have to break backwards compatibility.

I'm for delta-compression because it implies that revisions will be
packed together in one file. This would reduce the number of round

As a side note, in many cases it would reduce the pain of lacking bdiff
support. While not every binary compresses well, the particular images
I'm dealing with in my tailored tictactoe game would reduce from 1.4
megs down to 45k. 
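
A rough way to see why plain compression helps with uncompressed
bitmaps is the Python sketch below. It builds a synthetic single-colour
800x600 24-bit pixel buffer (about 1.4 megabytes, with no BMP header)
and runs it through zlib. The synthetic image is an assumption for
illustration and is far more uniform than a real background, so it does
not reproduce the 45k figure above.

```python
import zlib


def make_flat_bitmap(width, height, colour=(30, 90, 160)):
    """Raw 24-bit pixel data for a single-colour image (no BMP header)."""
    return bytes(colour) * (width * height)


def compressed_size(data, level=9):
    """Size in bytes of the zlib-compressed form of `data`."""
    return len(zlib.compress(data, level))


if __name__ == "__main__":
    raw = make_flat_bitmap(800, 600)
    print(len(raw), compressed_size(raw))
```

Images with large flat areas sit near this end of the scale; photos or
already-compressed formats would shrink much less.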

> It wouldn't be hard to imagine extending the changeset format to include
> a delta from a higher-level tool that knows about the file format, *in
> addition* to the regular GNU diff.  That way if there is a conflict, the
> user's tla could optionally call out to an external program which would
> make use of this information.  Otherwise, they just get the plain diffs.

This would be ok if, and only if, the unpacking tool could arbitrarily
handle the unpacking of any file.

In other words, let's hypothesize a smartgzip that, via plugins, could
use specialized algorithms to compress files (files without a special
plugin for their format could still be packed, but would end up using a
less efficient, more generalized packing method). However, anything that
was packed, generically or with a special plugin, would need to be
unpackable without those plugins.

Sure, smartgzip is technically achievable, but I'm not aware of anybody
working on this in the free software world. 
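
One way to picture the constraint is the Python sketch below. It is an
assumption-laden toy: the "plugins" only tune zlib parameters per file
extension, so every result is still an ordinary zlib stream that the
single generic unpack() can read. The registry, the function names, and
the choice of Z_FILTERED for .bmp files are all made up for
illustration. A real smartgzip would want much smarter format-specific
packers.

```python
import os
import zlib


def pack_generic(data):
    """Fallback packer: ordinary zlib at the default-ish level."""
    return zlib.compress(data, 6)


def pack_filtered(data):
    """Hypothetical 'plugin': maximum effort with the Z_FILTERED strategy."""
    comp = zlib.compressobj(
        9, zlib.DEFLATED, zlib.MAX_WBITS, 9, zlib.Z_FILTERED
    )
    return comp.compress(data) + comp.flush()


# Packers are chosen by extension; unknown formats use the generic one.
PLUGINS = {".bmp": pack_filtered}


def smart_pack(filename, data):
    """Pack with a format-specific plugin if one is registered."""
    suffix = os.path.splitext(filename)[1].lower()
    packer = PLUGINS.get(suffix, pack_generic)
    return packer(data)


def unpack(blob):
    """Unpacking never consults PLUGINS: any zlib reader can do this."""
    return zlib.decompress(blob)


if __name__ == "__main__":
    data = bytes(range(256)) * 100
    for name in ("background.bmp", "notes.txt"):
        assert unpack(smart_pack(name, data)) == data
```

The design point is that the packing side may vary while the output
format stays fixed, so a recipient without any plugins loses nothing.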

> Is this just a really obtuse way of saying "cacherevs for older
> revisions aren't automatically mirrored"?  That's an easy to fix bug,
> and I think it already has been.

How would you fix it? 

>>       * The GNU arch developers believe that it's easy for all
>>         developers participating in a project to publish a repository.
> I don't know how arch could possibly make it easier.  What do you
> propose instead?

I can back this up by *authoritatively stating* that it is easy to
publicly publish an archive. All that anybody in the world needs to do
is send email to address@hidden with the following three pieces of
information:

1. A statement that your archive is free software
2. Your full name
3. Your ssh key

James Blackwell          Try something fun: For the next 24 hours, give
Smile more!              each person you meet a compliment!

GnuPG (ID 06357400) AAE4 8C76 58DA 5902 761D  247A 8A55 DA73 0635 7400
