[Gnu-arch-users] round 2 of GCC v. Arch

From: Tom Lord
Subject: [Gnu-arch-users] round 2 of GCC v. Arch
Date: Wed, 23 Jun 2004 11:19:54 -0700 (PDT)

I think we've pretty much beat into the ground the commit rate issues 
and all agree that some minor variation on a PQM is needed.

I'm disappointed that my suggestion of a "narrative branch", a simple
solution to log pruning that gives both tiny logs _and_ access to
unabridged history, didn't inspire more reaction but that's ok: I'm
satisfied with it.

The next things are more mundane (from an arch perspective) but are
perhaps worth pointing out (or repeating) for "record" as might be
seen by any GCCers:

Aside from a busy mainline, three other characteristics of GCC that
catch my eye are:

1) the challenges they face dealing with long-lived branches
2) the strict use of development phasing for release management
3) the importance of detached operation

1) long-lived branches in GCC

  GCC is pressured to remain competitive (especially in the quality of
  generated code) against competition such as Intel's C compiler.

  At the same time, since the time that the GCC framework was first
  created, developments in the computer science of compiler
  implementation have revealed some ways in which the GCC framework 
  is showing its age:  known kinds of optimization, found in ICC,
  that are hard to implement in GCC without substantial changes.

  It takes time to make such sweeping changes and, moreover, it is 
  important to manage their development without disrupting the faster 
  pace of "ordinary" mainline changes.   So, such changes are made on 
  branches, remaining there across multiple GCC releases before
  finally being merged.

  Obvious to archers but perhaps not to others is that such branches
  are most naturally regarded as being in a star-topology with the
  mainline, thus the easiest-to-use and most effectively
  history-sensitive merging features of arch apply: keeping such a
  branch in sync with mainline, and even merging some of the work from
  branch into mainline "early", are both notably easier to do with
  arch than with most other systems.
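  As a concrete illustration, here is roughly what that looks like in
  tla.  (The archive and version names below are hypothetical, and this
  is a workflow sketch rather than an exact transcript.)

    # In a working tree of the long-lived branch, pull mainline
    # work in with a history-sensitive star-merge:
    cd ~/wd/gcc--tree-ssa
    tla star-merge gcc@gcc.gnu.org--2004/gcc--mainline--1.0
    tla commit -s "merge from mainline"

    # Merging some of the branch work into mainline "early" is the
    # same operation in the other direction, from a mainline tree:
    cd ~/wd/gcc--mainline
    tla star-merge gcc@gcc.gnu.org--2004/gcc--tree-ssa--1.0
    tla commit -s "merge tree-ssa work"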

  Less obvious, perhaps, are the benefits that can be realized if 
  there are multiple long-lived branches being developed concurrently,
  especially if they are likely to "touch" common areas of the tree:

  Just as the busy mainline serves as a "continuous, semi-automated
  integration branch", so too we can construct additional
  semi-automated integration branches to combine various subsets of
  our long-lived patches.  These secondary integration branches would
  not, usually, be for users or developers to access directly.  They
  would not distract users who would normally be testing the mainline.
  Rather, the secondary integration branches would simply give
  (automated) warnings whenever the branches begin to textually
  conflict and could be, if desired, made the subject of automated
  nightly testing.
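  A secondary integration branch of that sort could be driven by
  something as simple as a nightly cron job.  A sketch (archive and
  version names hypothetical; tla is expected to exit non-zero when a
  star-merge leaves conflicts):

    #!/bin/sh
    # Nightly: rebuild an integration tree combining two long-lived
    # branches; mail a warning when they begin to conflict.
    set -e
    INTEG=gcc@gcc.gnu.org--2004/gcc--integ--1.0
    tla get $INTEG /tmp/integ
    cd /tmp/integ
    for b in gcc--tree-ssa--1.0 gcc--rtlopt--1.0; do
        if ! tla star-merge gcc@gcc.gnu.org--2004/$b; then
            echo "conflict merging $b" \
              | mail -s "integration conflict" gcc-integ@example.org
            exit 1
        fi
    done
    tla commit -s "nightly integration merge"
    # ...optionally kick off automated testing of /tmp/integ here...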

2) development phasing

  GCC, of course, practices phased release management in which, as
  releases approach, restrictions are imposed on how the mainline
  is permitted to change.

  During the slush and freeze phases, one goal of the release manager
  is to keep the attention of contributors and volunteer testers
  squarely on the mainline.   Of course, contributors with their own
  agendas may experience this as a source of frustration when some of
  their changes can't be merged into mainline until the freeze is
  lifted.   There are occasional heated (well, "warmed") discussions
  during freezes in which consensus is reached about the right path
  through the freeze period: minimizing its duration while maximizing
  the quality of its results.

  A "radical" idea here, and one that goes against traditional
  thinking in GCC project management, is to instead fork for releases,
  allowing mainline development to continue.  Is it possible that,
  freed from being _required_ to work on a release before doing new
  mainline work, the developers could instead each find the right
  balance between working on the release and getting something else
  out of the way by putting it on the continuing mainline?

  Certainly, at times when the release after the one currently frozen
  is expected to contain major merges, such as from long-lived
  branches, there is an advantage in getting those changes into
  mainline as quickly as possible.  Why not before the frozen release?

  Once again, PQM can help.  A continuing mainline would, as with our
  other cases, be (in effect) a star-topology branch of the new
  release hub (or vice versa -- it doesn't matter which we call the
  "hub").  Back-patching from the slushy or frozen release branch to
  the continuing mainline can be semi-automated, with notices mailed
  out when conflicts occur and human intervention is needed.

  In effect, the release branch would function analogously (with
  respect to the
  continuing mainline) to the kind of developer branch that normally
  feeds the pqm for (development phase) mainline.
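  In tla terms, forking for a release and back-patching are both
  routine operations.  A sketch (hypothetical archive and version
  names):

    # Fork a release branch off mainline; mainline keeps moving:
    tla tag gcc@gcc.gnu.org--2004/gcc--mainline--1.0 \
            gcc@gcc.gnu.org--2004/gcc--release--1.0

    # Back-patching fixes from the slushy/frozen release branch into
    # the continuing mainline is another star-merge, which a PQM or
    # cron job can attempt automatically:
    cd ~/wd/gcc--mainline
    tla star-merge gcc@gcc.gnu.org--2004/gcc--release--1.0 \
      || echo "conflict: human intervention needed" \
         | mail -s "back-patch conflict" gcc-release@example.org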

  This _could_ distract hackers and volunteer testers from working
  on the release in a timely fashion but, on the other hand, it
  might have a more desirable effect:  namely to increase overall
  productivity by not having to "idle" developers who wouldn't
  otherwise be doing much on the release but currently are blocked
  from continuing (shared, committed) work on the post-release
  mainline.

3) detached operation

  Three observations:

  a) Periodically, GCC's CVS service is interrupted for one reason or
     another.  When this occurs, all work on mainline must, by
     definition, stop.

  b) GCC committers are widely scattered, both geographically and in
     the topology of the Internet, yet a CVS server must be
     centrally located.  Thus, there is no unambiguously good place to
     locate that server: some committers will be well served at the
     expense of others who are inconvenienced by high latency and
     perhaps constrained bandwidth to the server.

  c) The number of committers is very high but the number of
     contributors higher still.  Many who do sustained work on GCC are
     not committers.

  The result is ad hoc approaches such as making rsync copies of the
  CVS repository and building bridges between the two copies.
  Non-committers and unfortunately-located committers alike can use
  this technique to ensure that the data they access most is
  available locally.  Non-committers receive the additional benefit
  of being able to use a revision control system for their own work.
  Such a private mirror also serves as a temporary work-around should
  the real CVS server be down for a time.

  Arch, of course, turns that ad hoc approach into a built-in
  solution, a deep part of the core of arch.  It is ordinary and
  expected for arch users to work from local mirrors and in personal
  archives which can contain branches from the official archives.
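  The mirror-plus-personal-archive setup is a few commands in tla.  A
  sketch (archive names and locations hypothetical; exact
  mirror-creation options vary between tla versions):

    # Register the official archive and keep a local pull mirror of
    # it, refreshed from cron:
    tla register-archive gcc@gcc.gnu.org--2004 \
        http://arch.example.org/gcc-2004
    tla make-archive --mirror-from gcc@gcc.gnu.org--2004 \
        ~/archives/gcc-mirror
    tla archive-mirror gcc@gcc.gnu.org--2004

    # A non-committer's own work lives in a personal archive,
    # branched from the mirrored official one:
    tla make-archive jrh@example.org--gcc ~/archives/jrh-gcc
    tla tag gcc@gcc.gnu.org--2004/gcc--mainline--1.0 \
            jrh@example.org--gcc/gcc--my-hacks--1.0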

  Moreover, the distributed and detached capabilities of arch
  afford a great deal of risk mitigation against the possibility of
  the primary archive of a project being lost for an _extended_
  period of time.   If it were known, for example, that the GCC
  mainline would be gone for a week or a month, in arch it would be
  trivial to create a new, temporary mainline branching from
  the now-unavailable ordinary mainline, direct a PQM at the new
  mainline, continue work there, and, when recovery is complete,
  simply merge that work back into the ordinary mainline.
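  That recovery procedure, sketched in tla (hypothetical names; the
  temporary mainline is branched from any surviving mirror of the
  official archive):

    # Branch a temporary mainline from a mirror of the lost archive:
    tla tag mirror@example.org--gcc/gcc--mainline--1.0 \
            temp@example.org--gcc/gcc--mainline-temp--1.0
    # ...point the PQM at gcc--mainline-temp--1.0; work continues...

    # When the official archive is back, fold the temporary work in:
    cd ~/wd/gcc--mainline
    tla star-merge temp@example.org--gcc/gcc--mainline-temp--1.0
    tla commit -s "merge work done while primary archive was down"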

