[Gnu-arch-users] Round II -- new language, arch, furth, etc.

gnu-arch-users

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Gnu-arch-users] Round II -- new language, arch, furth, etc.

From:	Tom Lord
Subject:	[Gnu-arch-users] Round II -- new language, arch, furth, etc.
Date:	Tue, 20 Jul 2004 13:57:45 -0700 (PDT)


* Review:

  So far we have "xl", a very simple configuration language in which
  names can be bound to values.  Values can be numbers, strings,
  symbols, sequences of other values, or associative mappings of names
  to other values.   A typical program might look like:


        (define my-id "Tom Lord <address@hidden>")

        (define my-revlib-path '("/v/big/lord/{revlib}"
                                 "~/{revlib}"
                                 "/v/big/shared/{revlib}"))


  We've (to a useful-enough level of detail, for now) seen the grammar
  for xl (so far) and the semantic rules that define correct programs
  (such as no multiply-defined names and undefined variables being
  bound to #undefined).  The informal definition we've given could be
  made mathematically precise pretty easily.  We've sketched a trivial
  C API for this language and the run-time representation of values.


  So far we have no way to cause side-effects in xl, no way to run
  loops, and, indeed, only constant expressions and top-level 
  definitions.

  xl-so-far has an interesting property: all xl programs can be
  executed in a finite number of steps (specifically, 0 steps).
  I posit that this is an essential characteristic of anything that
  we'd wish to call a "configuration language".

  xl-so-far has another interesting property that we haven't seen
  much use for so far (other than in the API): xl programs can be
  represented as xl data.   It remains to be shown if/how that will
  be useful.

* Call that "xl0"

  We can dub the language-so-far "xl0" and it is, in and of itself,
  a perfectly good configuration language.   Above all, it's boringly
  simple and does little more than correct the ambiguities in, for
  example, the "RFC822 header language".

  Someone could now, pretty much, go off an implement xl0.   There
  are some open questions that would have to be nailed down, ideally
  in the form of a formal spec.  For example: what is the extact 
  syntax of strings?  What characters are permitted in strings?  What
  happens if an association contains two definitions for one name?

  I'm not aware of any such "open questions" that are hard to answer.
  I've left out detailed answers for them precisely because, in my
  judgement, they'll be easy to fill in later and getting bogged down
  in the details now would just obscure the presentation.

  So, that's xl0.


* More Expressiveness?

  I've noticed that, fairly often, people wind up supplementing
  static configuration languages (like xl0) with an additional
  layer adding some degree of programmability.

  For example: the X11 consortium programmers augmented "make"
  with the C pre-processor, giving them the ability to define
  and use macros in their makefiles.   

  Whenever that kind of programmability is added as an afterthought
  the result is awkward.  For example, programs reading the config
  file might not know they have to run /bin/cpp.   Or the syntax
  of macros in the front-end language might interfere with the syntax
  of the underlying language.

  _If_ it turns out to be a good idea to add a degree of
  programmability to xl0, probably we will want to "build that in"
  rather than relying on something ad hoc like trying to use /bin/m4.

  At the same time, if programmability is added, will we wind up with
  a turing complete language?   It's clearly not necessary (e.g,. 
  CPP itself is not turing complete).  So I decided on this rule:

        If programmability is added to xl, then:

        a) it must remain true that all programs execute in a 
           finite number of steps

        b) the implementation of an interpreter for xl must 
           remain very small and simple


** Is it Needed?

  Whether or not more expressiveness than xl0 is needed is essentially
  a matter of opinion.   If more expressiveness is provided, surely 
  it will find some interesting uses (see below).   If it isn't
  provided, people will likely wind up working around that, providing
  programmability in other ways.

  The reason to build-in programmability rather than work-around the
  absense of programmability is to help programmers better cooperate.
  If everyone has their own solution for programmability of arch
  configurations then one person's programs won't be useful to another
  person.   If everyone uses the same solution, then people can trade
  little xl programs and use xl to define cooperative infrastructures.

  Here is one example of how programmability might be used:

        In ~/.arch-params, the user wants to define the variables:

                my-default-archive
                my-default-category
                my-default-branch
                my-default-version
                my-default-fq-category
                my-default-fq-branch
                my-default-fq-version

        and wants these to all be related.  For example, the default
        fully qualified category should be the default archive plus
        the default category.

        One way (using imaginary xl code) to accomplish this would
        be using expressions:

                (define my-default-fq-vesion
                  "address@hidden/tla--devo--1.3")

                (define my-default-archive
                  (archive-of my-default-fq-vesion))

                (define my-default-category
                  (archive-of my-default-fq-vesion))

                [...etc...]


  Here is another example of how programmability might be used:

         The user wants to define some version aliases:

                upstream        -- the upstream version
                patch-queue     -- the patch queue for submissions

          however, the definitions of these depends on whether 
          the user is a maintainer or not.   Maintainers should
          use one upstream version and patch queue, others should
          use a different upstream version and patch queue.

          This (hypothetical) program might do the trick:

                (define maintainers '("joe" "jane" "fred"))

                (define upstream
                  (if (member my-user-name maintainers)
                      "address@hidden/proj--devo--1.3"
                      "address@hidden/proj--public--1.3"))

                (define patch-queue
                  (if (member my-user-name maintainers)
                      "address@hidden"
                      "address@hidden"))

           If repeated instances of that conditional test are
           undesirable, this would work just as well:
                 
                (define maintainers '("joe" "jane" "fred"))

                (define alias-info
                  (if (member my-user-name maintainers)
                      '{
                         version => "address@hidden/proj--devo--1.3"
                         queue => "address@hidden"
                       }

                      '{
                         version => "address@hidden/proj--devo--1.3"
                         queue => "address@hidden"
                       }))

                 (define upstream (ref alias-info 'version))
                 (define patch-queue (ref alias-info 'queue))

  These examples illustrate a simple "expression language" with 
  function calls (such as to "archive-of" in the first example)
  and conditionals (such as "if" in the last example).   One
  definition can apparently reference others (such as 
  `upstream' being defined in terms of `alias-info' in the last
  example).

* A Killer App for xl Expressions: Boilerplate Configurations

  The examples above show some potentially interesting uses for
  expressions in xl but may not be terribly compelling.   
  All that they do, really, is save a little typing.

  But the examples suggest a similar use that I, at least, find
  more significant:  boilerplate configurations.   This comes
  down to "support for modularity": helping programmers to divide
  up their work and then combine the pieces.

  Suppose that I send you (or build into tla, as a default) a 
  template like:

    (define my-id "<PUT YOUR ID HERE>")
    (define my-default-fq-version "<YOUR-ARCHIVE/YOUR VERSION>")

  and the rest of the program expressed entirely in terms of those:

        (define my-default-archive (archive-of my-default-fq-version))
        (define my-default-category (category-of ...))
        ...

  I've captured a set of rules that define the most common usage
  pattern for these variables.   You can apply that common-case usage
  to your environment just by fixing up the definitions of `my-id'
  and `my-default-fq-version'.   Your also free to change the
  definitions and use something besides the common-case rules -- but
  if you want the common case, it's been captured in a convenient way.

  Better still, the common case rules aren't "hard-wired" into arch
  in any deep way.   They can be revised just by changing the 
  template xl program.

  The only alternative here would be lots of detailed instructions 
  for how to edit the file.  Something like:

    ; my-default-fq-version  -- your default version
    ; 
    ; Normally, after updating this, you will want to make 
    ; corresponding changes to these other variables:
    ; 
    ;     my-default-fq-branch
    ;     my-default-fq-category
    ;     my-default-archive
    ;     [...etc...]

    (define my-id "<PUT YOUR ID HERE>")

  What if you had to read through and follow the instructions in 
  such comments everytime you want to do something that _should_ 
  be very simple (like instantiate a new master archive and 
  patch queue manager for a new project)?

  Expressions seem to me like an idea worth exploring.


* The Problems with Defining Expressions

  Adding expressions to xl raises a substantial number of language
  design and language implementation issues.

  First, as noted earlier, we have ensure that, even with expressions,
  an xl program can always be executed in a finite number of steps.

  Second, we have lots of little questions that need answering.  
  For example, is this is a valid program and, if so, what is
  the value of `x':

        ; Does the order of definitions matter?
        ; Is `x' 42 or #undefined? 
        ; 
        (define x y)
        (define y 42)

  Another example:

        ; When is syntax checked?  Can this
        ; program be evaluated to produce a value
        ; for `x' assuming we don't care about the
        ; value for `y'?
        ; 

        (define x 42)
        (define y (if mistake))

  Another example:

         ; Is this a valid program?
         ; 
         ; It doesn't _really_ contain an infinite recursion
         ; 

         (define x (if (odd? 3) 7 x))

  how about:

         (define x (if (odd? 3) x 7))

  And there are always banal questions about error handling during
  execution:

          ; What is the value of `y'?
          ; (What does division-by-0 produce?)
          ; 
          (define y (/ 1 0))


          ; What is the value of `y'?
          ; (What does out-of-range access produce?)
          ; 
          (define a-sequence '(a b c))

          (define y (ref a-sequence 100))  ; the 100th element?!?


* The Problems with Implementing Expressions

  One of our goals is to keep xl implementations very tiny and very 
  simple.   Another goal is to make sure that xl is well-enough 
  defined that implementations are interchangable.   Finally, 
  another goal is to preserve the property that all programs execute
  in a finite number of steps.

  These raise the question: how can we define expressions in xl in
  such a way that we are:

        1. confident all xl programs terminate
        2. confident that a simple implementation is practical
        3. confident that we have explained with precision
           how a correct implementation behaves


* Solving Those Problems

  Lots of very advanced techniques exist for defining languages in
  ways that provide clear and provable answers for the "Problems
  Defining Expressions" and "Problems Implementing Expressions".

  We don't need anything terribly advanced precisely because our
  language is so "weak" (i.e., not turing complete).  Many of the hard
  language-theory problems that show up once we allow
  possibly-infinite loops don't come up in xl.

  An easy way to define xl while addressing all of these issues will
  be to define it _operationally_.   In short, I'll write what is 
  in essense a mathematical description of an xl interpreter -- but
  you'll see that the mathematical description translates very
  directly into actual running code.    (Perhaps, time permitting, 
  I'll just go ahead and write the description _as_ running code.)

  We'll see, when that's done, that it isn't hard to prove things such
  as the termination property of xl programs.

  We'll also see, when that's done, that yes, in fact, a tiny and
  simple interpreter is possible.


* Sneak Preview

  Roughly speaking, the definition of xl-with-expressions will be
  expressed as a virtual machine, furth, extended with some new
  primitive primitive instructions.   Loading an xl program (in the
  form of data) into one of the registers of this VM, followed by
  running the VM for some number of cycles, will, as a side effect,
  evaluate xl expressions and store the result back in a VM register.

  We'll be able to infer, by induction on the process of evaluation,
  what values are produced by expressions.

  The operational definition will turn out to be slightly
  _overspecified_: it will do more than just define the values
  produced by expressions.   In addition, the operational definition
  will (somewhat by "accident") define the _order_of_evaluation_ of
  expressions, letting us reason about what order various
  subexpressions are evaluated in under various circumstances.

  The order of evaluation is superfluous, at first glance.   None of
  the expressions in the examples above depend on the order of
  evaluation at all.   For example, in this:

        (define x (+ (* 3 3) (* 4 4)))

  it doesn't matter whether we evaluate `(* 3 3)' first or `(* 4 4)'
  first.

  However, I'll then consider a class of built-in functions for which
  order-of-evaluation _does_ matter: functions that perform side
  effects on the environment.  I'll argue that the order of evaluation
  rules for xl expressions make for a very nice way to control the
  order of side effects.  I'll demonstrate ways in which, without
  adding anything more to xl than expressions and side-effecting
  functions, we get not only a very expressive configuration language,
  but a language with many interesting applications....

-t

[Prev in Thread]

Current Thread

[Next in Thread]

[Gnu-arch-users] Round II -- new language, arch, furth, etc., Tom Lord <=
- Re: [Gnu-arch-users] Round II -- new language, arch, furth, etc., Colin Walters, 2004/07/20
  - Re: [Gnu-arch-users] Round II -- new language, arch, furth, etc., Tom Lord, 2004/07/20
  - [Gnu-arch-users] Re: Round II -- new language, arch, furth, etc., Miles Bader, 2004/07/20
    - Re: [Gnu-arch-users] Re: Round II -- new language, arch, furth, etc., Colin Walters, 2004/07/20
    - Re: [Gnu-arch-users] Re: Round II -- new language, arch, furth, etc., Tom Lord, 2004/07/20
    - [Gnu-arch-users] Re: Round II -- new language, arch, furth, etc., Miles Bader, 2004/07/20
    - Re: [Gnu-arch-users] Re: Round II -- new language, arch, furth, etc., Matthieu Moy, 2004/07/21
    - Re: [Gnu-arch-users] Re: Round II -- new language, arch, furth, etc., James Blackwell, 2004/07/21
    - Re: [Gnu-arch-users] Re: Round II -- new language, arch, furth, etc., Andrew Suffield, 2004/07/21
    - Re: [Gnu-arch-users] Re: Round II -- new language, arch, furth, etc., Matthieu Moy, 2004/07/21

Prev by Date: Re: [Gnu-arch-users] Help with firewalled box
Next by Date: Re: [Gnu-arch-users] Status of global and tree aliases
Previous by thread: [Gnu-arch-users] Help with firewalled box
Next by thread: Re: [Gnu-arch-users] Round II -- new language, arch, furth, etc.
Index(es):
- Date
- Thread