a new run-time for Scheme implementations

From: Tom Lord
Subject: a new run-time for Scheme implementations
Date: Thu, 26 Jul 2001 17:50:55 -0700 (PDT)


  The project proposed below might be a good opportunity to 
  build abstractions around the tagging system, for example.


I would like to gauge whether a particular project I have in mind is
perceived as useful to other projects, whether there are hackers who
would like to work on the project, and whether we might find
financial support for the project.

The project I have in mind is to build a general purpose C library,
one that could either replace or augment most of libc, depending on
how it is used, and that includes a run-time system for high-level
languages such as Scheme, Java, and various functional languages.

Some of the high-level requirements that I have in mind are summarized
here, and spelled out in detail at the end of this message:

        * Scheme-like Types, Cleanly Available to C Programs
        * Java-like Types, Cleanly Available to C Programs
        * Thread Support
        * Unicode Support
        * Fancy Text Support 
        * A Replacement for Stdio
        * An Exception Mechanism
        * Robust Garbage Collection
        * Developed With Continuous Testing and Documentation

While building such a run-time system, we could simultaneously work on
tightly integrated implementations of Scheme, Java and similar
languages.  That would help to validate the design of the run-time
system.  (In case it isn't obvious, I have a strong hunch that there
is a tight, practical, clean language design that has, as subsets,
Scheme, a subset of Java (or some other statically typed imperative language 
with GC-based allocation), and some functional languages, both eager
and lazy.)

So: who's interested in such a library; who's interested in
contributing time, money, or other resources; and why are you
interested?  Please reply with your answers.  If there is sufficient
interest, I'll set up a mailing list.

Here are my answers:


I'm particularly interested in working on these subsystems:

        low-level Unicode support
        strings, including editable, attributed text
        the representations of Scheme-like and Java-like data types
        the interfaces and interactions between various subsystems
        at least one of the GC implementations
        a Scheme implementation built on top of the library

I'm interested in coordinating the project and gatekeeping patches.
If I can be paid for it, I'm interested in working on this nearly full
time, at least until the run-time system is reasonably "done" and

I'd be happy to act as the "design czar" for the project, with the
goals: (1) decide in a coherent way the large number of arbitrary
questions that are likely to arise; (2) decide the large number of
objective issues on the basis of consensus among the best qualified
contributors and customers.

As gatekeeper, I'd insist that new components include thorough
development tests (that run when you type "make test") and good
documentation before they're checked in to the primary development

I prefer that sets of related changes be submitted as complete
patch sets, so they can be more easily reviewed and otherwise

I have a server that could be used to host the project.  

I have a C library that I think would make a good starting point for
the project.  

I have considerable experience implementing Scheme, and a small amount
of experience with Java implementations.  I have a reading knowledge
of implementation techniques for functional languages.  I am quite
good with many of the relevant data structures.  I have a reading
knowledge of GC techniques and considerable experience with one (alas,
conservative) GC implementation.  I have exquisite taste in C coding

                        WHY I WANT TO DO THIS

Well, ego gratification, of course.  But also:

I've wanted a run-time system like this for several years.  Primarily,
I want to use it as a foundation component for a Scheme-based 
application framework.  Additionally, I'm quite sick of the
limitations of libc.

In the past, I've tried to make progress on this thing in the context
of other projects.  I've found that approach to be hopelessly

I think that several Free Software language implementations are in
trouble, in part because they depend on a broken GC.  In the name of
helping to further the success of Free Software, I'd like to help fix
these problems.

This project is an essential part of what I most want to hack on, so
I'd like to get paid for it.


        * Scheme-like Types, Cleanly Available to C Programs

          The run-time system should include Scheme-like data types,
          making these available to C, independent of any complete
          Scheme implementation.  For example, the library would
          provide garbage collected cons pairs (lisp lists), vectors
          (lisp arrays), strings, symbols, and arbitrary precision

          The run-time system should contain support for reading and
          writing these data structures in a variety of formats
          (ordinary lisp notation, pretty-printed lisp notation, and
          fast binary representations).

          Scheme-like data structures, by virtue of their generality
          and small number, facilitate many concise coding idioms and
          promote good code re-use.  I think they are intuitive (it is
          easy to picture how they work and to predict performance
          characteristics).  Such data types should be used more often
          than they are.  For example, an efficient implementation of
          these types, and a clean syntax for writing them, would make
          a welcome addition to Java.
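
          As a rough illustration of what C-visible Scheme-like
          values might look like, here is a minimal sketch of tagged
          words and cons pairs.  Every name in it (obj, cons,
          make_fixnum, and so on) is hypothetical and not taken from
          any existing library, and malloc merely stands in for a
          garbage-collected allocator.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical names throughout; not the API of any existing library. */
typedef uintptr_t obj;              /* every value fits in one word */

#define TAG_MASK   ((uintptr_t)3)
#define TAG_FIXNUM ((uintptr_t)1)   /* low bits 01: small integer */
#define TAG_PAIR   ((uintptr_t)2)   /* low bits 10: pointer to a pair */
#define NIL        ((obj)0)         /* the empty list */

struct pair { obj car, cdr; };

int is_fixnum(obj o) { return (o & TAG_MASK) == TAG_FIXNUM; }
int is_pair(obj o)   { return (o & TAG_MASK) == TAG_PAIR; }

obj make_fixnum(intptr_t n)
{
    return ((uintptr_t)n << 2) | TAG_FIXNUM;
}

intptr_t fixnum_value(obj o)
{
    assert(is_fixnum(o));
    /* Strip the tag, then divide rather than shift so that negative
       values do not depend on how signed right shift behaves. */
    return (intptr_t)(o - TAG_FIXNUM) / 4;
}

obj cons(obj car, obj cdr)
{
    struct pair *p = malloc(sizeof *p);   /* stand-in for a GC allocation */
    if (p == NULL)
        abort();
    /* The tag scheme assumes allocations are at least 4-byte aligned. */
    assert(((uintptr_t)p & TAG_MASK) == 0);
    p->car = car;
    p->cdr = cdr;
    return (obj)p | TAG_PAIR;
}

obj car(obj o) { assert(is_pair(o)); return ((struct pair *)(o - TAG_PAIR))->car; }
obj cdr(obj o) { assert(is_pair(o)); return ((struct pair *)(o - TAG_PAIR))->cdr; }

/* Count the pairs along the cdr chain of a list. */
long list_length(obj l)
{
    long n = 0;
    while (is_pair(l)) {
        n++;
        l = cdr(l);
    }
    return n;
}
```

          With these definitions, a C program could build the list
          (1 -2) as cons(make_fixnum(1), cons(make_fixnum(-2), NIL))
          and walk it with car and cdr, with no Scheme interpreter
          involved.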

          This feature does, unfortunately, present a challenge to 

        * Java-like Types, Cleanly Available to C Programs

          Similarly, the run-time system should include Java-like
          objects, or better, a simpler object model in which
          Java-like objects can be implemented.

          I am aware that some work has been done unifying C++ and
          Java-like objects.  I'm not overly familiar with this work,
          but my initial impression was that it is a narrowly focused
          solution and that I want something a bit cleaner and more
          general.  If the existing C++ work is good, then perhaps 
          it would be compatible with our run-time system.

        * Thread Support

          The run-time system should be able to take good advantage
          of threading on multi-processor machines.

        * Unicode Support

          The run-time system should have good support for
          all of Unicode and presumed future extensions to Unicode.

          This doesn't necessarily mean using ICU.  ICU puts a lot of
          emphasis on compatibility with older libraries and on
          transcoding, neither of which is especially important to
          this project.  ICU seems too large and complicated for our
          purposes.  I have the foundation of a Unicode library 
          that I think is a better fit.

        * Fancy Text Support 

          The run-time system should have good support for editable,
          attributed text.  This support should facilitate integration
          with Pango or other libraries dedicated to rendering 
          attributed text.

          Typically, such a facility is built into a GUI toolkit,
          rather than a C run-time system.  I think it makes more
          sense to provide this facility at a lower level, tying it to
          generic data types rather than widgets, facilitating greater
          code re-use, encouraging use in non-graphical applications,
          efficiently integrating with the I/O subsystem, etc.

        * A Replacement for Stdio

          Stdio persists solely because it has momentum, not because
          it is a good design.  Andrew Hume's I/O library pointed to a
          better way to manage buffers (so as to avoid needless
          copying of data).  My I/O library builds on that idea,
          adding support for stackable I/O protocols, and making all
          of the interfaces descriptor based rather than file object
          based.  Having used my library for a few years now (even
          though it is not quite finished), I am convinced that there
          is no better approach currently available.
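
          To make the buffer-management idea concrete, here is a
          minimal sketch of a descriptor-based interface that hands
          the caller a pointer into the library's own buffer instead
          of copying bytes out of it.  It is only an illustration of
          the general technique on a POSIX system; the names
          buf_peek and buf_consume and the fixed-size table are
          assumptions, not the API of Hume's library or of the
          library described here, and stackable protocols are not
          shown.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical sketch: one input buffer per descriptor, looked up by
   descriptor number rather than through a FILE-like object. */
enum { MAX_FD = 64, BUF_SIZE = 4096 };

struct fd_buffer {
    char   data[BUF_SIZE];
    size_t start, end;          /* unread bytes are data[start..end) */
};

static struct fd_buffer buffers[MAX_FD];

/* Expose the unread bytes for fd in place, refilling the buffer with
   read() only when it is empty.  Returns the number of bytes available,
   0 at end of file, or -1 on error.  No data is copied to the caller. */
ssize_t buf_peek(int fd, const char **bytes)
{
    struct fd_buffer *b;

    assert(fd >= 0 && fd < MAX_FD);
    b = &buffers[fd];
    if (b->start == b->end) {
        ssize_t n = read(fd, b->data, BUF_SIZE);
        if (n <= 0)
            return n;
        b->start = 0;
        b->end = (size_t)n;
    }
    *bytes = b->data + b->start;
    return (ssize_t)(b->end - b->start);
}

/* Tell the library that the caller is done with the first n bytes. */
void buf_consume(int fd, size_t n)
{
    struct fd_buffer *b;

    assert(fd >= 0 && fd < MAX_FD);
    b = &buffers[fd];
    assert(n <= b->end - b->start);
    b->start += n;
}
```

          A caller parses directly out of the pointer returned by
          buf_peek and then calls buf_consume for however many bytes
          it used, so a line reader, for example, never needs a
          second buffer of its own.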

        * An Exception Mechanism

          The run-time system should have high-level support for an
          exception mechanism.

        * Robust Garbage Collection

          The run-time system should include a robust garbage
          collector.  Nearly every popular, free implementation of
          Scheme and Java that I have seen uses a collector that is at
          least partially conservative.  These collectors count as GC
          roots any value on the C stack that "looks like" a pointer
          to valid data.

          Such collectors have a serious problem: they leak storage.
          It is not difficult to create situations where the collector
          wrongly treats some value on the C stack as a GC root --
          either because the value looks like a pointer, but isn't, or
          because the value was once a pointer, but is now a dead
          value that was never overwritten.
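
          The failure mode can be shown with a toy model.  The
          sketch below is hypothetical and far simpler than any real
          collector: the "heap" is a small array of cells, an array
          of words stands in for the C stack, and a flat address
          space is assumed.  It shows only that a conservative scan
          cannot distinguish a live pointer from a stale or
          coincidental bit pattern.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy heap of fixed-size cells; not from any real collector. */
enum { NCELLS = 4 };
struct cell { int marked; char payload[28]; };
static struct cell heap[NCELLS];

/* Conservative test: does this word "look like" a pointer to a cell? */
int looks_like_cell(uintptr_t word)
{
    uintptr_t lo = (uintptr_t)&heap[0];
    uintptr_t hi = (uintptr_t)&heap[NCELLS];
    return word >= lo && word < hi && (word - lo) % sizeof(struct cell) == 0;
}

/* Scan n words (standing in for the C stack) and mark every cell that
   any word appears to reference.  Returns the number of cells retained. */
int conservative_mark(const uintptr_t *words, size_t n)
{
    int retained = 0;

    assert(words != NULL || n == 0);
    for (int i = 0; i < NCELLS; i++)
        heap[i].marked = 0;
    for (size_t i = 0; i < n; i++) {
        if (looks_like_cell(words[i])) {
            size_t idx = (words[i] - (uintptr_t)&heap[0]) / sizeof(struct cell);
            if (!heap[idx].marked) {
                heap[idx].marked = 1;
                retained++;
            }
        }
    }
    return retained;
}

int cell_is_marked(int i) { return heap[i].marked; }
```

          If one scanned word holds the address of a cell, whether
          as a dead pointer left behind in a stack slot or as an
          integer that happens to have the same bits, the scan marks
          that cell and it is never reclaimed.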

          Such storage leaks can cause obvious problems, like
          processes that are too large, and less obvious problems,
          like processes that leak file descriptors.  In either case,
          long-running programs or life-critical programs are poor
          candidates for these collectors.  I admit that the bugs
          actually occur infrequently, and many people use such
          collectors happily, but then people do lots of crazy things.

          One way to fix such collectors is to modify a C compiler,
          such as GCC.  It would, perhaps, be nice if GNU C had
          support for precise scans of the C stack, and presumably
          something along these lines will eventually be implemented
          as part of a C# compiler.

          Nevertheless, in my opinion, it is a poor idea to tie your
          run-time system to a particular compiler.

          Another way to fix the problem might be to look for a clever
          application of C++ features, especially automatic
          destructors.  Another way would be to use C#.  Still another
          way would be to use C, but require explicit memory
          management by C programs.

          My current prejudice is to use C, require explicit
          memory management, and to also build a C++ interface,
          if one can be built that simplifies programming.
          I think that C is a fine language, when used properly,
          and that robust GC support need not be too hard to use.
          I have heard Emacs hackers complain that explicit GC
          management is difficult and error prone.  I think this
          is easily fixed by a lint-like tool.  Some years ago,
          I built a lint-like tool that verified some GC-related
          invariants in a C implementation of Scheme.  Although my
          tool was a prototype, it usefully discovered several bugs
          that were easily fixed.  This approach could be portably
          and robustly applied.  For convenience and performance,
          if there were demand for it, a version of this tool could
          be built into GCC.
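
          For readers unfamiliar with explicit GC management in C,
          here is a minimal sketch of the general technique: C code
          registers the addresses of the local variables that hold
          heap pointers, and the collector treats only those
          variables as roots.  The names (gc_protect, gc_unprotect,
          gc_alloc, gc_collect) and the tiny fixed heap are
          assumptions made for illustration; this is not the
          interface of Emacs, of the Scheme implementation mentioned
          above, or of the proposed library.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy collector with explicitly registered roots. */
enum { HEAP_CELLS = 8, MAX_ROOTS = 16 };

struct cell { int in_use, marked; struct cell *car, *cdr; };

static struct cell   heap[HEAP_CELLS];
static struct cell **roots[MAX_ROOTS];   /* addresses of protected C variables */
static size_t        nroots;

/* Register a C variable as a root; unregister the n most recent roots. */
void gc_protect(struct cell **var) { assert(nroots < MAX_ROOTS); roots[nroots++] = var; }
void gc_unprotect(size_t n)        { assert(n <= nroots); nroots -= n; }

/* Mark everything reachable from c: recurse on car, iterate on cdr. */
static void mark(struct cell *c)
{
    while (c != NULL && !c->marked) {
        c->marked = 1;
        mark(c->car);
        c = c->cdr;
    }
}

/* Precise collection: only registered variables count as roots.
   Returns the number of cells reclaimed. */
size_t gc_collect(void)
{
    size_t freed = 0;

    for (size_t i = 0; i < nroots; i++)
        mark(*roots[i]);
    for (int i = 0; i < HEAP_CELLS; i++) {
        if (heap[i].in_use && !heap[i].marked) {
            heap[i].in_use = 0;
            freed++;
        }
        heap[i].marked = 0;
    }
    return freed;
}

/* Allocate a cell, or return NULL if the heap is full.  A real
   allocator would collect and retry instead of giving up. */
struct cell *gc_alloc(struct cell *car, struct cell *cdr)
{
    for (int i = 0; i < HEAP_CELLS; i++) {
        if (!heap[i].in_use) {
            heap[i].in_use = 1;
            heap[i].car = car;
            heap[i].cdr = cdr;
            return &heap[i];
        }
    }
    return NULL;
}
```

          In this scheme a pointer held only in an unprotected local
          is reclaimed by the next collection.  One plausible
          invariant for a lint-like tool to check is therefore that
          every local holding a heap pointer across a call that may
          allocate has been passed to gc_protect first.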

          I'm not yet familiar enough with C# to be comfortable
          recommending that, and I'm dubious about the nature of its
          origin and the economic politics that surrounds it.  These
          are very weak objections, of course, so I am open to

        * Developed With Continuous Testing and Documentation

          The development process should be characterized by
          continuous testing and continuous documentation.

          I don't think there's any other sane way to undertake a project
          of this scope and complexity.
