Re: Why Emacs needs a modern bug tracker

From: Eric S. Raymond
Subject: Re: Why Emacs needs a modern bug tracker
Date: Sat, 5 Jan 2008 13:24:56 -0500
User-agent: Mutt/1.5.15+20070412 (2007-04-11)

Eli Zaretskii <address@hidden>:
> You are implicitly assuming here that what is good for a COCOMO-25
> project and 10-12 active developers should be also good for a
> COCOMO-328 project with fewer than 10 developers.  Do you have any
> evidence that this assumption is true, or arguments that would tell me
> such an assumption is reasonable?

Yes, I think I do.  Let's consider some of the scaling curves.

First, the size of a project's bug load is driven by the square of
LOC.  This is because most bugs are clashes between assumptions made
in differing parts of the code.  Modularization can reduce such
clashes, but it's basically unheard of to get *quadratic* reduction,
especially on large old codebases; the best you can really hope for 
is a linear reduction (cf Clark and Baldwin's "Design Rules", 1999).

The utility of a tracker is (at least) proportional to the size of the
bug load, because one of its functions is to help identify the N most
critical bugs at any given time.  Arguably it's proportional to the
*square* of the bug load -- one of its other uses is to identify and
record bug interdependencies.

Therefore: best case, tracker utility TL rises as the square of LOC.
Worst case, it rises as the fourth power of LOC.  This would neatly
explain why the jump from 60K lines to only 100K puts GPSD and Wesnoth
in regimes that are qualitatively different.

(For what it's worth, my own belief is that bug interdependencies have
the statistics of a scale-free network.  I can explain why in detail if
you care; it goes back to Ross Anderson's papers applying statistical
thermodynamics to model bug distribution in large systems. In that
case the actual utility curve of the tracker is somewhere between
LOC**2 and LOC**4, at about (LOC**2/k) * log(LOC**2/k) where k is a
constant measuring the degree of modularity in the code.  LOC**2 will
underestimate this substantially unless k is absurdly large.)
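
A minimal Python sketch of the three candidate curves, evaluated at the 60K and 100K line counts mentioned above for GPSD and Wesnoth. The value of k is purely hypothetical (the message gives none), and the function name is invented for illustration; the point is only to compare how the three forms scale, not to estimate real utilities.

```python
import math


def tracker_utility_scale_free(loc: float, k: float) -> float:
    """(LOC**2 / k) * log(LOC**2 / k), the scale-free form quoted above."""
    x = loc ** 2 / k
    return x * math.log(x)


K = 1e6  # hypothetical modularity constant; not a measured value
gpsd_loc, wesnoth_loc = 60_000, 100_000  # line counts cited in the message

# Utility ratios Wesnoth : GPSD under each curve (the constant m cancels).
ratio_quadratic = (wesnoth_loc / gpsd_loc) ** 2
ratio_quartic = (wesnoth_loc / gpsd_loc) ** 4
ratio_scale_free = (
    tracker_utility_scale_free(wesnoth_loc, K)
    / tracker_utility_scale_free(gpsd_loc, K)
)

print(f"LOC**2:     {ratio_quadratic:.2f}")
print(f"scale-free: {ratio_scale_free:.2f}")
print(f"LOC**4:     {ratio_quartic:.2f}")
```

For this choice of k the scale-free ratio lands between the quadratic and quartic ratios; other values of k shift it within that band.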

But to be as friendly as possible to your skepticism we'll set the utility 
function TL(LOC) = m * LOC**2, for m an unknown constant.

Now, remember that the LOC ratio we're talking about is 10:1.  That
means that, best case for your skepticism, a tracker should be O(10**2)
times as valuable to Emacs as it is to Wesnoth, assuming we hold the
number of developers for both projects to the same constant (doesn't
matter what it is).  Worst case, O(10**4) times.  
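
The arithmetic behind those two bounds can be checked with a short Python sketch. It assumes only what is stated above: a 10:1 LOC ratio and TL(LOC) = m * LOC**n with n = 2 (best case) or n = 4 (worst case). The function name is made up for this example.

```python
def utility_ratio(loc_ratio: float, exponent: int) -> float:
    """Ratio TL(big) / TL(small) when TL(LOC) = m * LOC**exponent.

    The unknown constant m appears in both numerator and denominator,
    so it cancels and only the LOC ratio matters.
    """
    return loc_ratio ** exponent


LOC_RATIO = 10.0  # Emacs : Wesnoth, as stated in the message

best_case = utility_ratio(LOC_RATIO, 2)   # skeptic-friendly quadratic curve
worst_case = utility_ratio(LOC_RATIO, 4)  # quartic curve

print(best_case, worst_case)
```

Because m cancels, the result does not depend on how utility is measured, only on the assumed exponent.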

Now let's suppose that the utility curve of a tracker with respect to
some number of developers d is modeled by an unknown function TD(d),
for LOC constant.  And that the joint utility T(LOC, d) is some linear
or multiplicative composite of TL(LOC) and TD(d), eg T(LOC, d) = a *
TL(LOC) + b * TD(d) + c * TL(LOC) * TD(d) for unknown constants a b c.
This is the friendliest possible assumption for you.  In fact the
joint function probably has nonlinear and monotonic-increasing terms
in both variables.

In order for a tracker to be less valuable to Emacs than it is to
Wesnoth, TD(d) would have to drop towards zero so much faster than
LOC**2 rises that it would swamp a more than two-order-of-magnitude
difference in TL(LOC).

To be concrete: I know Emacs has a minimum of about 6 developers just
from watching the list.  Let's say Wesnoth has 12.  That's 2:1.  A 2:1
drop in d would have to swamp a 10:1 rise in LOC.  Not plausible
at all.
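
To put a number on "not plausible", here is a hedged Python sketch. It makes two assumptions that are not in the message: that the joint utility is purely multiplicative, T = TL(LOC) * TD(d), and that TD follows a power law TD(d) = d**p. Under those assumptions it computes how large p would have to be for a 2:1 developer ratio to cancel the best-case LOC**2 advantage.

```python
import math


def break_even_exponent(loc_ratio: float, dev_ratio: float,
                        loc_exponent: int = 2) -> float:
    """Exponent p at which dev_ratio**p equals loc_ratio**loc_exponent.

    Assumes T = TL * TD with TL ~ LOC**loc_exponent and a hypothetical
    power law TD ~ d**p. For any p below this value the larger codebase
    still gets more utility from a tracker despite having fewer developers.
    """
    return loc_exponent * math.log(loc_ratio) / math.log(dev_ratio)


p = break_even_exponent(loc_ratio=10, dev_ratio=2)
print(f"TD(d) would have to grow faster than d**{p:.2f}")
```

In other words, under these assumptions tracker utility would need to grow faster than roughly the sixth power of team size before the developer-count difference outweighed the code-size difference.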

(Note one of the things that has changed since Emacs practices
assumed approximately their present form in the early 1990s: back
then, the scaling laws I'm applying were not at all understood.
The germ of them had been present in Brooks's Law c. 1975, but 
it took a lot of work by a lot of people, including Clark and
Baldwin and Ross Anderson and Les Hatton and -- er -- me to
get from vague intuitions to even a qualitative scaling theory.)

> > One of the places a real issue database is most concretely useful is when
> > you're triaging bugs to close on a release.  It is *immensely* helpful
> > in making clear what needs to be done and at what point you are
> > finished doing it.
>
> In Emacs development, we have problems to even find a release manager,
> let alone someone who will replace Richard as a head maintainer.  So
> having a bug triage system that is significantly better than a flat
> text file such as admin/FOR-RELEASE is not necessarily the first
> priority here.

Perhaps not.  But it certainly couldn't hurt.  And, maybe, if
bug-triage weren't quite so much like having a herd of elephants
stampede over your testicles, a release manager might be just a
*leetle* easier to find?

> Emacs is HUGE.  Its immense size is not the only problem: there are
> many parts in it that require experts in specific areas (GUI display,
> networking, Lisp infrastructure, email, multilingual text editing, to
> name just a random few) in order to know what is right and wrong when
> reviewing patches.  Just figuring out how best to organize maintenance
> of such a large package is a daunting task, to say nothing of actually
> implementing such a maintenance scheme (which would mean finding and
> recruiting individuals who could become part of such a team, then
> making a coherent and cooperative team out of them).  

All this is certainly true.

>                                                   It is IMO naive
> at best to think that switching to more collaborative tools would
> somehow magically solve these _real_ problems, or even pave a way for
> their _practical_ solution.

It would be tremendously naive to believe that better collaborative
tools will magically solve these problems. But that's a straw man; I
don't believe it, and you are *certainly* not stupid enough to suppose 
that I do.

But when you exclude the possibility that they might pave the way...
start asking yourself how many potential contributors Emacs has lost
because the project toolkit looks like stone knives and bearskins.

After that unnerving experience I had on 29 December, I can name
names: David Matuszek.  Cyndy Matuszek.  Toren Smith.  Matt Taylor.
Donna Malayeri.  That's five Emacs users and potential contributors
lost right there.  And you know what?  If I had known it was
important, I'm certain I could have walked three steps into the
Matuszeks' living room and collected five more refuseniks just
from the faces I knew, let alone the strangers.

That's from a single point sample on a single night, and it's more
people than you'll admit to having on your entire dev team.  Wake up;
it's later than you think.
                Eric S. Raymond  <http://www.catb.org/~esr/>
