Re: [Gnu-arch-users] Re: darcs vs tla


From: Dustin Sallings
Subject: Re: [Gnu-arch-users] Re: darcs vs tla
Date: Mon, 8 Nov 2004 15:20:51 -0800


On Nov 8, 2004, at 2:08, Catalin Marinas wrote:

> Timothy Webster <address@hidden> writes:
> > I would like to hear from users who have tried both tla and
> > darcs. And specifically why I should not go with darcs.

I use both quite a bit and really have a difficult time trying to figure out which one I like more.

(I'm trying to use generic language below, since arch and darcs use different terms for the same concepts. Hopefully, like me, you'll find that fact more annoying than the terms I'm using.)

Darcs has some nice concepts, such as breaking a checkin into multiple checkins at commit time (e.g. I changed two parts of a file, but this changeset should only reflect the changes I made to the bottom of the file, not the stuff I did at the top or in the middle).
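
As a toy illustration of that kind of per-change selection (a sketch only, not how darcs is implemented; the function names `hunks` and `apply_selected` are made up for this example), the difference between the old and new file can be split into independent hunks, and only the chosen ones applied:

```python
import difflib

def hunks(old, new):
    """Return the changed regions between two lists of lines."""
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    # Every opcode that is not "equal" is one hunk: (tag, i1, i2, j1, j2).
    return [op for op in matcher.get_opcodes() if op[0] != "equal"]

def apply_selected(old, new, selected):
    """Build the file that results from applying only the chosen hunks."""
    result = []
    position = 0  # index into old of the next line not yet copied
    for index, (tag, i1, i2, j1, j2) in enumerate(hunks(old, new)):
        result.extend(old[position:i1])   # unchanged lines before this hunk
        if index in selected:
            result.extend(new[j1:j2])     # take the new version of the region
        else:
            result.extend(old[i1:i2])     # keep the old version of the region
        position = i2
    result.extend(old[position:])         # unchanged lines after the last hunk
    return result

old = ["top", "a", "middle", "b", "bottom"]
new = ["TOP", "a", "middle", "b", "BOTTOM"]

# Commit only the change at the bottom of the file (hunk 1);
# the edit at the top stays in the working tree for a later checkin.
first_commit = apply_selected(old, new, {1})
```

Selecting every hunk reproduces the new file, and selecting none leaves the old one untouched.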

It's also very nice that a working directory is effectively a branch from a checked out tree. This is a very natural approach to branching. I was arguing with a friend about how bad things like CVS break people's mentalities and prevent them from doing better, and he gave me some sort of ``9x% of the time all I do is ci and up'' response regarding branching. I pointed out that in darcs, an ``update'' is a branch integration and there's no way to distinguish the two. That's one of the nicest things about using it.
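
A deliberately simplified model of why the two operations coincide (an assumption-laden sketch: it treats a repository as nothing more than a list of named patches and ignores patch dependencies, conflicts, and reordering; `pull` here is a hypothetical helper, not the darcs command):

```python
def pull(local, remote):
    """Append every patch in remote that local does not have yet."""
    missing = [patch for patch in remote if patch not in local]
    return local + missing

upstream = ["init", "add-parser"]
working = list(upstream)            # getting a working copy copies the patch list
working.append("my-local-fix")      # a local checkin makes the copy diverge
upstream.append("fix-typo")         # meanwhile the other tree moves on

working = pull(working, upstream)   # an "update" ...
upstream = pull(upstream, working)  # ... and a branch integration are the same call
```

In this model nothing marks one tree as "the repository" and the other as "the checkout"; both are patch lists, and either can pull from the other.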

The branching in general is actually very nice in its simplicity. There are no limits on integrations (that I can tell) short of patch dependencies, and it doesn't require you to think about branching or offline development before you find yourself in a foxhole somewhere without connectivity.

I end up sending patches via email to a central repository as well, which I find to be very nice.


However, the lack of separation between a repository and a working tree can be a little odd. Having multiple projects in a single repository in arch seems nice in that there's only one thing I have to worry about setting up, and I can incrementally add distinct projects to it that are related only in how I think about them. With darcs, I do have common project directories, but I have to manage each piece separately (although I have scripts that help with a lot of this).

Also, in arch, a branch has a separate patch space from the tree from which you branched. This, along with cacherevs, can make things a lot smaller and easier to look at (although darcs has some similar concepts, they're not quite the same). This also gives you the opportunity to have a long-developed feature in a branch be merged as a single changeset instead of having each little checkin pulled in.

> main problem with it - it is incredibly slow. I tried it with the
> Linux kernel (~300MB sources) and the commit operation (after applying
> an 18MB patch) took around 3 hours, in which time my machine was
> completely unusable.

The extreme case isn't handled all that efficiently yet, no. I believe all this tells you is that it's possible to handle very large projects, although if you actually have a project with this much source, it might not be recommended just yet.

> existing structure or not). Even if this would be implemented, more
> engineering needs to go into it before it could cope with the level of
> patches in the Linux kernel (around 50 patches a day).

Again, that's a phenomenally big project. I'm not arguing that Linux is particularly well designed or managed, but it's extremely rare for any project with that kind of commit rate to exist in the world with any sort of quality. Plenty of software houses might have that kind of rate, but not necessarily in a single project.

> A second problem I think is Haskell. Not so many people can help with
> coding and it is also much slower than C or C++. The today's compilers
> are not smart enough to optimally deal with pure functional
> languages.

This is clearly wrong. Haskell was the #1 reason that pointed me in the direction of darcs (and no, I didn't know very much of it at the time). I greatly support projects creating software in higher level languages instead of holding so fast to the belief that it'll be slow if they do it in anything other than C.

I write a lot of code in OCaml (not purely functional, though most of my code is), and I can assure you *that* compiler optimizes very well compared to gcc. It does not seem intuitive to me that a compiler for a low-level language such as C could optimize better than a compiler for a high-level language, such as ghc, ocaml, or an eiffel compiler. Expressing what you want at a high level gives the compiler much more flexibility in how it can deal with it (i.e. ``move data over there'' vs. ``allocate a 32-bit integer pointer to the top of this buffer and seek to the first null position [...]'').

Anyway, my experience has shown me that I can get far faster apps with less code (i.e. sooner) by avoiding C, and have them be more stable to boot (we all make mistakes).

> With darcs you also need to understand its theory of patches since it
> doesn't report a conflict for cases where arch does (this is where I
> think darcs should at least let you know).

Perhaps it should let you know, but I think this is more of a workflow issue. For example, perforce lets me know when there are conflicts, but its output is not that easy to read (conflicts and updates look very similar), so I end up wrapping my updates in a script that does my update, automatic conflict resolution, tagging, and occasionally branch integration all at one time.

> Arch's patches are more
> readable since they are based on the diff format.

That's not exactly true. A darcs patch is a single text file, while an arch patch is a tarred up directory with standard diffs along with other supporting files.

> While darcs is a nice research project, my recommendation would be to
> stay with arch, at least until you hear somebody happily using darcs
> with a huge source tree like the Linux kernel.

I don't know that a source tree like the Linux kernel is all that necessary, but darcs itself has had nearly 2,200 patches since 2002. Compare that to about 4,000 in a project at my company that I consider to be fairly rapidly developed, dating from December 2001. (Actually, this project, too, is broken into two trees of about 4,000 patches and 3,500 patches in the same timeline.)

While I do believe it's a good metric, how a system handles the most extreme case you can find isn't necessarily a practical way to determine what's a good fit for you.

--
Dustin Sallings
