
[Savannah-hackers] Would a couple of you please study arch?


From: Richard Stallman
Subject: [Savannah-hackers] Would a couple of you please study arch?
Date: Fri, 29 Mar 2002 20:45:01 -0700 (MST)

Would a couple of people please take a look at arch
and tell me how it compares with CVS (and with subversion,
if any of you knows)?

Please ack to me if you are going to do this.  It may take time to do
this, but I would like to know soon that it is being done.

Tom Lord later said:

    I've made the first public release of `arch', a new revision control
    system.  You can find it at:

            http://www.regexps.com


Date: Fri, 28 Dec 2001 15:46:17 -0800 (PST)
Message-Id: <address@hidden>
From: Tom Lord <address@hidden>
To: address@hidden
Subject: revision control systems
Content-Type: text
Content-Length: 10359


I've written a free software source code management and revision
control system called `arch'.  I think `arch' compares well with
CVS and Subversion and some of the commercial competition.

Some quick highlights of the feature list are:

        + distributed databases -- each hacker or group can host their
          own branches.  There's a global (world wide) name-space for
          lines of development and revisions.  Branches can be formed
          from any repository to any other and merge operations can 
          span repository boundaries without needing to actually
          duplicate the full contents of a repository at each site.
          
        + fancy merging -- `arch' has support for various styles
          of history-sensitive branch merging.  The way branches
          and patch-sets interact with distributed repositories
          makes it practical to distribute the responsibilities
          for patch-review and merging.

        + renames handled -- of course file and directory renames
          are handled accurately.  So are symbolic links and file
          permissions.

        + unobtrusive operation -- `arch' is designed to stay out
          of your way while making changes and rearranging files.  It
          is designed to have a clean and self-documenting
          command-line interface having the finest characteristics of
          good Unix tools.

`arch' is, at its core, a collection of shell scripts and a tiny bit
of new C code.  It brings many classic shell-utils, FTP, diff, and
patch together and turns them into a distributed version control
system.  In spite of the simplicity, `arch' is not a toy: it's quite
sophisticated and, in my opinion, elegant.  It captures the style of
diff/patch work that we relied on before remote-CVS took over the
world, fills in some gaps, and packages the whole deal behind a nice
(command line) user interface.  Competing RC systems are far more
complex than they need to be.

Enclosed below is a longer list of `arch' features.

Could you let me know if `arch' is interesting to you?  I'm trying to
find a commercial sponsor to help move it forward.  One obstacle I've
encountered is that arch is new so there isn't yet "enthusiastic
community support" for it -- a sort of chicken-and-egg problem.

`arch' is newer than other systems -- so it is less tested.  From a
hacking point of view, what I'd really want to be able to do is a few
months of intensive and focused testing and tuning, culminating in
applying it to some larger projects.

A user's guide for arch, describing most of the features and how to
use them, is available at:

        http://www.regexps.com/super-secret/arch.html

regards,
-t


                 Key Features: Branching and Merging

* Fancy Tagging, Branching, and Merging

  `arch' is designed with unprecedented support for developing on
  branches and performing complex merges with automated assistance.

  Forming a branch (or tag) is inexpensive in both space and time.
  Tags are revisioned -- meaning that a complete history is kept of
  how a tag has been applied.

  For merging, `arch' provides a number of operations:

  `update': a `CVS'-style merge operator (diff the working copy
  against a common ancestor (from any branch) and apply those diffs to
  the latest revision).

  `replay': a `Subversion'-style history sensitive merge operator
  (apply to the working copy all deltas that are found in the latest
  revision (from any branch) but not previously applied to the working
  copy).

  `reconcile': an operation unique to `arch' which plans a
  multi-branch `replay'-based merge, finding an ordering of patches
  from those branches which minimizes sources of potential conflicts.

  `i-merge': another operation ("idempotent merge") unique to `arch':
  i-merge forms a revision whose delta from its ancestors consists
  entirely of merges with other branches (any combination of `update'
  and `replay').  `replay' and `update' can treat such deltas
  specially, skipping them for trees that have already undergone
  similar merges.  `i-merge' makes history-sensitive merging more
  effective and helps a team of programmers avoid having to repeatedly
  solve the same set of merge conflicts.  (The `i-merge' feature is
  the only one mentioned in this message not done yet.  Based on my
  experience implementing similar features, `i-merge' needs 2-3 days to
  get working and pass initial testing.  I've postponed implementing
  it until I have a chance to work on `arch' full-time again -- using
  the planned feature as a kind of cognitive book-mark to recover my
  state after being away from the code for a few weeks.)

  `replay --exact' and `replay --list': operations which allow you to
  apply revision deltas in any user-selected order, while still taking
  advantage of history-sensitivity.

  `mkpatch' and `dopatch': `arch''s "next generation" replacements for
  `diff -r -c' and `patch'.  These can be used to perform arbitrary
  delta computation and applications on working copies.
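
  To make the difference between `update' and `replay' concrete, here
  is a rough Python sketch.  It is not how `arch' is implemented: it
  assumes a toy model in which a tree is a dict of path to content, a
  patch is a (name, delta) pair, and conflicts are ignored (the delta
  being applied simply wins).  All function names are hypothetical.

```python
def tree_delta(old, new):
    """Return {path: new_content_or_None} for every path that differs."""
    delta = {}
    for path in old.keys() | new.keys():
        if old.get(path) != new.get(path):
            delta[path] = new.get(path)  # None marks a removed file
    return delta


def apply_delta(tree, delta):
    """Return a copy of tree with the delta applied."""
    result = dict(tree)
    for path, content in delta.items():
        if content is None:
            result.pop(path, None)
        else:
            result[path] = content
    return result


def update(ancestor, working, latest):
    """CVS-style: carry local changes (ancestor -> working) onto latest."""
    return apply_delta(latest, tree_delta(ancestor, working))


def replay(working, applied, patches):
    """History-sensitive: apply, in order, each patch not yet applied."""
    applied = set(applied)
    for name, delta in patches:
        if name not in applied:
            working = apply_delta(working, delta)
            applied.add(name)
    return working, applied


ancestor = {"a.c": "v1", "b.c": "v1"}
working = {"a.c": "local", "b.c": "v1"}
latest = {"a.c": "v1", "b.c": "v2"}
merged = update(ancestor, working, latest)
# merged == {"a.c": "local", "b.c": "v2"}

patches = [("patch-1", {"b.c": "v2"}), ("patch-2", {"c.c": "new"})]
tree, seen = replay(merged, {"patch-1"}, patches)
# patch-1 is skipped because it is already recorded as applied
```

  The point of the sketch is only the bookkeeping: `update' needs a
  common ancestor, while `replay' needs a record of which patches a
  tree has already absorbed.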


* Directory and File Renames Handled Cleanly

  Changes are tracked across file and directory renames.  For example,
  if you have a local working directory and "update" against the
  repository (merge changes in the repository with local changes) -- and
  either the repository or your local tree (or both) has been
  "rearranged" -- the merge process takes those renames into account.
  As a practical matter, this creates an important new degree of
  freedom for developers: the freedom to "clean up" code by improving
  its organization without having to pay a high cost in revision
  control system maintenance.
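
  One way a merge can survive renames is to key files by a stable
  identity rather than by path.  The Python sketch below assumes that
  model; the ids, paths, and function names are hypothetical and are
  not taken from `arch' itself.  Path and content are merged
  independently, so a rename on one side combines with an edit on the
  other.  Added and deleted files are left out to keep it short.

```python
def pick(base, mine, theirs):
    """Three-way choice for one value: take whichever side changed it."""
    if mine == base:
        return theirs
    if theirs == base or mine == theirs:
        return mine
    raise ValueError(f"conflict: {mine!r} vs {theirs!r}")


def merge_trees(ancestor, local, repo):
    """Merge {file_id: (path, content)} trees for files present in all three."""
    merged = {}
    for file_id in ancestor.keys() & local.keys() & repo.keys():
        base_path, base_text = ancestor[file_id]
        my_path, my_text = local[file_id]
        their_path, their_text = repo[file_id]
        merged[file_id] = (
            pick(base_path, my_path, their_path),
            pick(base_text, my_text, their_text),
        )
    return merged


ancestor = {"id-1": ("src/util.c", "int f();")}
local = {"id-1": ("src/util.c", "int f(void);")}   # edited locally
repo = {"id-1": ("lib/util.c", "int f();")}        # renamed in the repository
result = merge_trees(ancestor, local, repo)
# result == {"id-1": ("lib/util.c", "int f(void);")}
```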

  

                      Key Features: Repositories


* Distributed Revision Databases

  `arch' has a global (as in "world wide") name-space for revisions.

  `arch' seamlessly integrates all accessible revision repositories,
  both local and remote, into one large database.  Branches can span
  repository boundaries, etc.  That has big implications for open
  source processes, both intra-organizationally, and on a global
  scale.

  Each developer or organization can have a private database for
  day-to-day work, or for organization- or feature-specific branches.

  Loosely cooperating organizations can have separately administered
  repositories that, nevertheless, mutually support branching and
  merging.
  
  An unwelcome source of de-facto authority (hosting a public
  project's `CVS' repository) is undermined by `arch'.  More
  positively, `arch' lowers the barriers to coordinated
  inter-organizational development: if your repository is publicly
  readable, anybody can create branches -- there is no need to hand
  out write access to everyone who wants to play.


* Low Cost Server Administration

  `arch' remote repository access is via the FTP protocol.  An `arch'
  server can be a generic (unix-based) FTP server.

  Server administration requirements are minimal: databases can be
  created trivially and (unlike `CVS') never become wedged (except as
  a result of file system failures (or, sigh, bugs -- if there are
  any)).  Repositories can be easily migrated.  Repositories can be
  mirrored for read-only purposes.


* Atomic, Concurrent, Independent, and Durable Transactions

  Commits are atomic.  Concurrent commits to separate lines of
  development are permitted.  Commits are independent of "gets"
  (check-outs).  Commits are durable to the limits of the underlying
  file system.  If a commit hangs (say, a client dies) with locks 
  held -- those locks can be broken remotely.
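
  As an illustration of how a commit can be atomic on an ordinary file
  system, here is a minimal Python sketch.  It assumes a local
  directory per revision and relies on rename being atomic within one
  file system; the layout and the name `atomic_commit' are
  hypothetical, and this message does not say that `arch' works this
  way.

```python
import os
import shutil
import tempfile


def atomic_commit(repo_dir, revision_name, files):
    """Stage files in a temporary directory, then publish with one rename."""
    final_dir = os.path.join(repo_dir, revision_name)
    staging = tempfile.mkdtemp(prefix=".staging-", dir=repo_dir)
    try:
        for name, text in files.items():
            with open(os.path.join(staging, name), "w") as handle:
                handle.write(text)
        # Readers see either no revision directory or a complete one.
        os.rename(staging, final_dir)
    except BaseException:
        # A failed commit leaves nothing behind in the repository.
        shutil.rmtree(staging, ignore_errors=True)
        raise
    return final_dir
```

  A client that dies before the rename leaves only a staging directory,
  which can be removed without affecting any published revision.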



                        Key Features: Logging
  
* Useful Semi-Automated Logging

  `arch' log entries contain lots of automatically generated
  information that is useful for browsing repository history and for
  performing intelligent (history sensitive) merges.


* Automatic ChangeLog Maintenance

  `arch' can automatically generate GNU-style ChangeLog files from
  revision control log entries.  If your tree contains automatically
  generated log files, `arch' will update them during `commit', and
  after every merge operation that changes a revision's patch history.
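
  The idea can be sketched in a few lines of Python.  The log-entry
  fields used here (date, author, email, files, summary) and the
  sample values are assumptions made for illustration, not the actual
  `arch' log format.

```python
def format_changelog(entries):
    """Render log entries, newest first, as GNU-style ChangeLog text."""
    blocks = []
    for entry in sorted(entries, key=lambda e: e["date"], reverse=True):
        header = f'{entry["date"]}  {entry["author"]}  <{entry["email"]}>'
        body = f'\t* {", ".join(entry["files"])}: {entry["summary"]}'
        blocks.append(f"{header}\n\n{body}\n")
    return "\n".join(blocks)


entries = [
    {"date": "2001-12-27", "author": "A. Hacker", "email": "hacker@example.org",
     "files": ["merge.sh"], "summary": "Add replay support."},
    {"date": "2001-12-28", "author": "A. Hacker", "email": "hacker@example.org",
     "files": ["a.c", "b.c"], "summary": "Fix rename handling."},
]
print(format_changelog(entries))
```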


  
                     Key Features: User Interface

* Patch Set Browsing

  Any patch set, for a committed revision, between a working copy and
  its ancestors, or between arbitrary trees, can be summarized in an
  HTML-formatted report, with lists of renamed files and directories,
  and hyper-links to individual file deltas, added files, and removed
  files.  This is a boon to developers writing log entries and to
  patch reviewers.  One of my favorite commands has become: 

        netscape --remote "openURL(`arch what-changed --url`)"


* Command-line Driven, Self-Documenting

  `arch' is a collection of small and simple software tools.  The
  collection has very regular and thorough conventions for option
  names and defaulting behavior.  Every command has an extensive
  `--help' message describing its options and functionality.  The
  command `arch --help-commands' gives an orderly summary of all the
  commands available with brief descriptions of each.


* Far More GUI Work Possible

  `arch' is designed from the ground up to be layered under separately
  developed GUIs.  For example, `arch''s log entries contain enough
  information to drive a tool that draws the branch-merge graph of
  revision history, conveniently represented as plain-text data in
  RFC822-style message headers.
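
  Because the headers are RFC822-style, a front end can read them with
  any standard mail parser.  The Python sketch below uses the standard
  `email' module; the header names (`Revision', `Merged-From') and the
  revision names are invented for the example and may not match what
  `arch' actually writes.

```python
from email import message_from_string

LOG_ENTRY = """\
Revision: main/patch-3
Merged-From: feature/patch-1
Merged-From: bugfix/patch-2
Summary: merge feature and bugfix work

Body text of the log entry.
"""


def merge_edges(log_text):
    """Return (source, target) edges for a branch-merge graph."""
    message = message_from_string(log_text)
    target = message["Revision"]
    return [(source, target) for source in message.get_all("Merged-From", [])]


edges = merge_edges(LOG_ENTRY)
# edges == [("feature/patch-1", "main/patch-3"),
#           ("bugfix/patch-2", "main/patch-3")]
```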



                  Key Features: Performance Metrics

  
* Pretty Fast, Efficient Use of Bandwidth, Effective Use of Disk Space

  `arch' seems to be pretty fast, and for good reasons.  Tree-deltas
  (patches) are exchanged with servers as compressed tar files.
  `arch' makes clever use of client-side caching.  On my
  (unremarkable) system, `commit' processes around 10 files per
  second.  (Rigorous comparative benchmarking and final tuning remain
  to be done, however.)
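
  For a feel of why shipping a patch set as one compressed tar file is
  cheap, here is a small Python sketch using the standard `tarfile'
  module.  The function names and the in-memory dict of files are
  assumptions for the example; the real on-the-wire layout of an
  `arch' patch set is not described in this message.

```python
import io
import tarfile


def pack_patch_set(files):
    """Pack {name: bytes} into a gzip-compressed tar archive (as bytes)."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        for name, data in sorted(files.items()):
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


def unpack_patch_set(blob):
    """Inverse of pack_patch_set: return {name: bytes}."""
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r:gz") as archive:
        return {
            member.name: archive.extractfile(member).read()
            for member in archive.getmembers()
            if member.isfile()
        }


patch_set = {"a.c.patch": b"+ new line\n" * 500, "log": b"Summary: demo\n"}
blob = pack_patch_set(patch_set)
```

  A single archive means one transfer per revision instead of one per
  file, and repetitive diff text compresses well.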


* Maintainable Size

  The heart of the implementation (around 30K lines) is (ahem) almost
  entirely shell scripts and awk code.  (This is not a joke -- `arch'
  is a serious system.)  In spite of the size and implementation
  languages, `arch' is more featureful than `CVS' and seems to be
  faster at common-case operations.


* Useful Subsets Small Enough to Add to Other Source Packages

  It is practical to distribute a tiny subset of `arch' with any of
  your source packages.  Contributors without repositories can use
  that subset to prepare `arch'-compatible patches or to apply `arch'
  patch sets.


regards
-t




