guile-commits
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Guile-commits] 02/02: Update CPS language documentation


From: Andy Wingo
Subject: [Guile-commits] 02/02: Update CPS language documentation
Date: Thu, 17 Sep 2015 18:15:11 +0000

wingo pushed a commit to branch master
in repository guile.

commit d701e8a3d36cb18096042413bd60b0993845beea
Author: Andy Wingo <address@hidden>
Date:   Thu Sep 17 20:14:35 2015 +0200

    Update CPS language documentation
    
    * doc/ref/compiler.texi (Continuation-Passing Style): Update to latest
      CPS language.
---
 doc/ref/compiler.texi |  507 ++++++++++++++++++++++++++++++++++---------------
 1 files changed, 353 insertions(+), 154 deletions(-)

diff --git a/doc/ref/compiler.texi b/doc/ref/compiler.texi
index 9743c53..75fd4e5 100644
--- a/doc/ref/compiler.texi
+++ b/doc/ref/compiler.texi
@@ -513,12 +513,8 @@ Optimization passes performed on Tree-IL currently include:
 and calls to primitives to primcalls)
 @item Partial evaluation (comprising inlining, copy propagation, and
 constant folding)
address@hidden Common subexpression elimination (CSE)
 @end itemize
 
-In the future, we will move the CSE pass to operate over the lower-level
-CPS language.
-
 @node Continuation-Passing Style
 @subsection Continuation-Passing Style
 
@@ -534,6 +530,7 @@ compiler.
 * An Introduction to CPS::
 * CPS in Guile::
 * Building CPS::
+* CPS Soup::
 * Compiling CPS::
 @end menu
 
@@ -624,12 +621,57 @@ details manifest, and gives them names.
 @node CPS in Guile
 @subsubsection CPS in Guile
 
-Guile's CPS language is composed of @dfn{terms}, @dfn{expressions},
-and @dfn{continuations}.
address@hidden continuation, CPS
+Guile's CPS language is composed of @dfn{continuations}.  A continuation
+is a labelled program point.  If you are used to traditional compilers,
+think of a continuation as a trivial basic block.  A program is a
+``soup'' of continuations, represented as a map from labels to
+continuations.
+
address@hidden term, CPS
address@hidden expression, CPS
+Like basic blocks, each continuation belongs to only one function.  Some
+continuations are special, like the continuation corresponding to a
+function's entry point, or the continuation that represents the tail of
+a function.  Others contain a @dfn{term}.  A term contains an
address@hidden, which evaluates to zero or more values.  The term also
+records the continuation to which it will pass its values.  Some terms,
+like conditional branches, may continue to one of a number of
+continuations.
+
+Continuation labels are small integers.  This makes it easy to sort them
+and to group them into sets.  Whenever a term refers to a continuation,
+it does so by name, simply recording the label of the continuation.
+Continuation labels are unique among the set of labels in a program.
+
+Variables are also named by small integers.  Variable names are unique
+among the set of variables in a program.
+
+For example, a simple continuation that receives two values and adds
+them together can be matched like this, using the @code{match} form from
address@hidden(ice-9 match)}:
+
address@hidden
+(match cont
+  (($ $kargs (x-name y-name) (x-var y-var)
+      ($ $continue k src ($ $primcall '+ (x-var y-var))))
+   (format #t "Add ~a and ~a and pass the result to label ~a"
+           x-var y-var k)))
address@hidden smallexample
+
+Here we see the most common kind of continuation, @code{$kargs}, which
+binds some number of values to variables and then evaluates a term.
+
address@hidden {CPS Continuation} $kargs names vars term
+Bind the incoming values to the variables @var{vars}, with original
+names @var{names}, and then evaluate @var{term}.
address@hidden deftp
+
+The @var{names} of a @code{$kargs} are just for debugging, and will end
+up residualized in the object file for use by the debugger.
 
-A term can either evaluate an expression and pass the resulting values
-to some continuation, or it can declare local continuations and contain
-a sub-term in the scope of those continuations.
+The @var{term} in a @code{$kargs} is always a @code{$continue}, which
+evaluates an expression and continues to a continuation.
 
 @deftp {CPS Term} $continue k src exp
 Evaluate the expression @var{exp} and pass the resulting values (if any)
@@ -639,44 +681,33 @@ as in @code{source-properties} or is @code{#f} if there 
is no associated
 source.
 @end deftp
 
address@hidden {CPS Term} $letk conts body
-Bind @var{conts}, a list of continuations (@code{$cont} instances), in
-the scope of the sub-term @var{body}.  The continuations are mutually
-recursive.
address@hidden deftp
+There are a number of expression kinds.  Above you see an example of
address@hidden
 
-Additionally, the early stages of CPS allow for a set of mutually
-recursive functions to be declared as a term.  This @code{$letrec} type
-is like Tree-IL's @code{<fix>}.  The contification pass will attempt to
-transform the functions declared in a @code{$letrec} into local
-continuations.  Any remaining functions are later lowered to @code{$fun}
-expressions.
-
address@hidden {CPS Term} $letrec names syms funs body
-Declare the mutually recursive set of functions denoted by @var{names},
address@hidden, and @var{funs} within the sub-term @var{body}.  @var{names}
-and @var{syms} are lists of symbols, and @var{funs} is a list of
address@hidden values.  @var{syms} are globally unique.
address@hidden {CPS Expression} $primcall name args
+Perform the primitive operation identified by @code{name}, a well-known
+symbol, passing it the arguments @var{args}, and pass all resulting
+values to the continuation.  The set of available primitives includes
+all primitives known to Tree-IL and then some more; see the source code
+for details.
 @end deftp
 
-A higher-order CPS program is a @code{$cont} containing a @code{$kfun}
-(see below), and the @code{$kfun} which contains clauses and those
-clauses contain terms.  A first-order CPS program, on the other hand, is
-the result of closure conversion and does not contain nested functions.
-Closure conversion lifts code for all functions up to the top, collects
-their entry continuations as a list of @code{$cont} @code{$kfun}
-instances and binds them in a @code{$program}.
-
address@hidden {CPS Term} $program funs
-A first-order CPS term declaring a recursive scope for first-order
-functions in a compilation unit.  @var{funs} is a list of @code{$cont}
address@hidden instances.  The first entry in the list is the entry
-function for the program.
address@hidden deftp
address@hidden dominate, CPS
+The variables that are used by @code{$primcall}, or indeed by any
+expression, must be defined before the expression is evaluated.  An
+equivalent way of saying this is that predecessor @code{$kargs}
+continuation(s) that bind the variables(s) used by the expression must
address@hidden the continuation that uses the expression: definitions
+dominate uses.  This condition is trivially satisfied in our example
+above, but in general to determine the set of variables that are in
+``scope'' for a given term, you need to do a flow analysis to see what
+continuations dominate a term.  The variables that are in scope are
+those variables defined by the continuations that dominate a term.
 
 Here is an inventory of the kinds of expressions in Guile's CPS
-language.  Recall that all expressions are wrapped in a @code{$continue}
-term which specifies their continuation.
+language, besides @code{$primcall} which has already been described.
+Recall that all expressions are wrapped in a @code{$continue} term which
+specifies their continuation.
 
 @deftp {CPS Expression} $const val
 Continue with the constant value @var{val}.
@@ -687,47 +718,11 @@ Continue with the procedure that implements the primitive 
operation
 named by @var{name}.
 @end deftp
 
address@hidden {CPS Expression} $fun free body
-Continue with a procedure.  @var{free} is a list of free variables
-accessed by the procedure.  Early CPS uses an empty list for @var{free};
-only after closure conversion is it correctly populated.  Finally,
address@hidden is the @code{$kfun} @code{$cont} of the procedure entry.
address@hidden deftp
-
address@hidden is part of higher-level CPS.  After closure conversion,
address@hidden instances are given a concrete representation.  By default,
-a closure is represented as an object built by a @code{$closure}
-expression
-
address@hidden {CPS Expression} $closure label nfree
-Build a closure that joins the code at the continuation named
address@hidden with space for @var{nfree} free variables.  The variables
-will be initialized later via @code{free-variable-set!} primcalls.
address@hidden deftp
-
-If the closure can be proven to never escape its scope then other
-lighter-weight representations can be chosen.
-
 @deftp {CPS Expression} $call proc args
address@hidden {CPS Expression} $callk label proc args
 Call @var{proc} with the arguments @var{args}, and pass all values to
 the continuation.  @var{proc} and the elements of the @var{args} list
 should all be variable names.  The continuation identified by the term's
 @var{k} should be a @code{$kreceive} or a @code{$ktail} instance.
-
address@hidden is for the case where the call target is known to be in
-the same compilation unit.  @var{label} should be some continuation
-label, though it need not be in scope.  In this case the @var{proc} is
-simply an additional argument, since it is not used to determine the
-call target at run-time.
address@hidden deftp
-
address@hidden {CPS Expression} $primcall name args
-Perform the primitive operation identified by @code{name}, a well-known
-symbol, passing it the arguments @var{args}, and pass all resulting
-values to the continuation.  The set of available primitives includes
-all primitives known to Tree-IL and then some more; see the source code
-for details.
 @end deftp
 
 @deftp {CPS Expression} $values args
@@ -736,7 +731,8 @@ Pass the values named by the list @var{args} to the 
continuation.
 
 @deftp {CPS Expression} $branch kt exp
 Evaluate the branching expression @var{exp}, and continue to @var{kt}
-with zero values if the test evaluates to true.  Otherwise, in the false
+with zero values if the test evaluates to true.  Otherwise continue to
+the continuation named in the outer @code{$continue} term.
 
 Only certain expressions are valid in a @var{$branch}.  Compiling a
 @code{$branch} avoids allocating space for the test variable, so the
@@ -744,9 +740,9 @@ expression should be evaluatable without temporary values.  
In practice
 this condition is true for @code{$primcall}s to @code{null?}, @code{=},
 and similar primitives that have corresponding @address@hidden VM
 operations; see the source code for full details.  When in doubt, bind
-the test expression to a variable, and reference the variable in the
address@hidden expression.  The optimizer should inline the reference if
-possible.
+the test expression to a variable, and branch on a @code{$values}
+expression that references that variable.  The optimizer should inline
+the reference if possible.
 @end deftp
 
 @deftp {CPS Expression} $prompt escape? tag handler
@@ -758,30 +754,73 @@ the continuation labelled @var{handler}, which should be a
 @code{pop-prompt} primcalls.
 @end deftp
 
-The remaining element of the CPS language in Guile is the continuation.
-In CPS, all continuations have unique labels.  Since this aspect is
-common to all continuation types, all continuations are contained in a
address@hidden instance:
address@hidden higher-order CPS
address@hidden CPS, higher-order
address@hidden first-order CPS
address@hidden CPS, first-order
+There are two sub-languages of CPS, @dfn{higher-order CPS} and
address@hidden CPS}.  The difference is that in higher-order CPS,
+there are @code{$fun} and @code{$rec} expressions that bind functions or
+mutually-recursive functions in the implicit scope of their use sites.
+Guile transforms higher-order CPS into first-order CPS by @dfn{closure
+conversion}, which chooses representations for all closures and which
+arranges to access free variables through the implicit closure parameter
+that is passed to every function call.
+
address@hidden {CPS Expression} $fun body
+Continue with a procedure.  @var{body} names the entry point of the
+function, which should be a @code{$kfun}.  This expression kind is only
+valid in higher-order CPS, which is the CPS language before closure
+conversion.
address@hidden deftp
 
address@hidden {CPS Continuation Wrapper} $cont k cont
-Declare a continuation labelled @var{k}.  All references to the
-continuation will use this label.
address@hidden {CPS Expression} $rec names vars funs
+Continue with a set of mutually recursive procedures denoted by
address@hidden, @var{vars}, and @var{funs}.  @var{names} is a list of
+symbols, @var{vars} is a list of variable names (unique integers), and
address@hidden is a list of @code{$fun} values.  Note that the @code{$kargs}
+continuation should also define @var{names}/@var{vars} bindings.
 @end deftp
 
-The most common kind of continuation binds some number of values, and
-then evaluates a sub-term.  @code{$kargs} is this kind of simple
address@hidden
+The contification pass will attempt to transform the functions declared
+in a @code{$rec} into local continuations.  Any remaining @code{$fun}
+instances are later removed by the closure conversion pass.  By default,
+a closure is represented as an object built by a @code{$closure}
+expression.
+
address@hidden {CPS Expression} $closure label nfree
+Build a closure that joins the code at the continuation named
address@hidden with space for @var{nfree} free variables.  The variables
+will be initialized later via @code{free-set!} primcalls.  This
+expression kind is part of first-order CPS.
address@hidden deftp
 
address@hidden {CPS Continuation} $kargs names syms body
-Bind the incoming values to the variables @var{syms}, with original
-names @var{names}, and then evaluate the sub-term @var{body}.
+If the closure can be proven to never escape its scope then other
+lighter-weight representations can be chosen.  Additionally, if all call
+sites are known, closure conversion will hard-wire the calls by lowering
address@hidden to @code{$callk}.
+
address@hidden {CPS Expression} $callk label proc args
+Like @code{$call}, but for the case where the call target is known to be
+in the same compilation unit.  @var{label} should denote some
address@hidden continuation in the program.  In this case the @var{proc}
+is simply an additional argument, since it is not used to determine the
+call target at run-time.
 @end deftp
 
-Variable names (the names in the @var{syms} of a @code{$kargs}) should
-be unique among all other variable names.  To bind a value to a variable
-and then evaluate some term, you would continue with the value to a
address@hidden that declares one variable.  The bound value would then be
-available for use within the body of the @code{$kargs}.
+At this point we have described terms, expressions, and the most common
+kind of continuation, @code{$kargs}.  @code{$kargs} is used when the
+predecessors of the continuation can be instructed to pass the values
+where the continuation wants them.  For example, if a @code{$kargs}
+continuation @var{k} binds a variable @var{v}, and the compiler decides
+to allocate @var{v} to slot 6, all predecessors of @var{k} should put
+the value for @var{v} in slot 6 before jumping to @var{k}.  One
+situation in which this isn't possible is receiving values from function
+calls.  Guile has a calling convention for functions which currently
+places return values on the stack.  A continuation of a call must check
+that the number of values returned from a function matches the expected
+number of values, and then must shuffle or collect those values to named
+variables.  @code{$kreceive} denotes this kind of continuation.
 
 @deftp {CPS Continuation} $kreceive arity k
 Receive values on the stack.  Parse them according to @var{arity}, and
@@ -806,18 +845,18 @@ Note that all of these names with the exception of the 
@var{var}s in the
 @var{kw} list are source names, not unique variable names.
 @end deftp
 
-Additionally, there are three specific kinds of continuations that can
-only be declared at function entries.
+Additionally, there are three specific kinds of continuations that are
+only used in function entries.
 
 @deftp {CPS Continuation} $kfun src meta self tail clauses
 Declare a function entry.  @var{src} is the source information for the
 procedure declaration, and @var{meta} is the metadata alist as described
 above in Tree-IL's @code{<lambda>}.  @var{self} is a variable bound to
 the procedure being called, and which may be used for self-references.
address@hidden declares the @code{$cont} wrapping the @code{$ktail} for this
-function, corresponding to the function's tail continuation.
address@hidden is the first @code{$kclause} @code{$cont} instance for the
-first @code{case-lambda} clause in the function, or otherwise @code{#f}.
address@hidden is the label of the @code{$ktail} for this function,
+corresponding to the function's tail continuation.  @var{clause} is the
+label of the first @code{$kclause} for the first @code{case-lambda}
+clause in the function, or otherwise @code{#f}.
 @end deftp
 
 @deftp {CPS Continuation} $ktail
@@ -826,10 +865,10 @@ A tail continuation.
 
 @deftp {CPS Continuation} $kclause arity cont alternate
 A clause of a function with a given arity.  Applications of a function
-with a compatible set of actual arguments will continue to @var{cont}, a
address@hidden @code{$cont} instance representing the clause body.  If
-the arguments are incompatible, control proceeds to @var{alternate},
-which is a @code{$kclause} @code{$cont} for the next clause, or
+with a compatible set of actual arguments will continue to the
+continuation labelled @var{cont}, a @code{$kargs} instance representing
+the clause body.  If the arguments are incompatible, control proceeds to
address@hidden, which is a @code{$kclause} for the next clause, or
 @code{#f} if there is no next clause.
 @end deftp
 
@@ -842,41 +881,41 @@ constructors or accessors, or instead of S-expression 
matching.
 
 Deconstruction and matching is handled adequately by the @code{match}
 form from @code{(ice-9 match)}.  @xref{Pattern Matching}.  Construction
-is handled by a set of mutually recursive builder macros:
address@hidden, @code{build-cps-cont}, and @code{build-cps-exp}.
-
-In the following interface definitions, consider variables containing
address@hidden to be recursively build by @code{build-cps-cont}, and
-likewise for @code{term} and @code{exp}.  Consider any other name to be
-evaluated as a Scheme expression.  Many of these forms recognize
address@hidden in some contexts, to splice in a previously-built value;
-see the specifications below for full details.
-
address@hidden {Scheme Syntax} build-cps-term ,val
address@hidden {Scheme Syntax} build-cps-term ($letk (cont ...) term)
address@hidden {Scheme Syntax} build-cps-term ($letrec names syms funs term)
address@hidden {Scheme Syntax} build-cps-term ($continue k src exp)
address@hidden {Scheme Syntax} build-cps-term ($program conts)
address@hidden {Scheme Syntax} build-cps-exp ,val
address@hidden {Scheme Syntax} build-cps-exp ($const val)
address@hidden {Scheme Syntax} build-cps-exp ($prim name)
address@hidden {Scheme Syntax} build-cps-exp ($fun src meta free body)
address@hidden {Scheme Syntax} build-cps-exp ($call proc (arg ...))
address@hidden {Scheme Syntax} build-cps-exp ($call proc args)
address@hidden {Scheme Syntax} build-cps-exp ($primcall name (arg ...))
address@hidden {Scheme Syntax} build-cps-exp ($primcall name args)
address@hidden {Scheme Syntax} build-cps-exp ($values (arg ...))
address@hidden {Scheme Syntax} build-cps-exp ($values args)
address@hidden {Scheme Syntax} build-cps-exp ($prompt escape? tag handler)
address@hidden {Scheme Syntax} build-cps-cont ,val
address@hidden {Scheme Syntax} build-cps-cont (k ($kargs (name ...) (sym ...) 
term))
address@hidden {Scheme Syntax} build-cps-cont (k ($kargs names syms term))
address@hidden {Scheme Syntax} build-cps-cont (k ($kif kt kf))
address@hidden {Scheme Syntax} build-cps-cont (k ($kreceive req rest kargs))
address@hidden {Scheme Syntax} build-cps-cont (k ($kentry self tail-cont 
,clauses))
address@hidden {Scheme Syntax} build-cps-cont (k ($kentry self tail-cont (cont 
...)))
address@hidden {Scheme Syntax} build-cps-cont (k ($kclause ,arity cont))
address@hidden {Scheme Syntax} build-cps-cont (k ($kclause (req opt rest kw 
aok?) cont))
+is handled by a set of mutually builder macros:
address@hidden, @code{build-cont}, and @code{build-exp}.
+
+In the following interface definitions, consider @code{term} and
address@hidden to be built by @code{build-term} or @code{build-exp},
+respectively.  Consider any other name to be evaluated as a Scheme
+expression.  Many of these forms recognize @code{unquote} in some
+contexts, to splice in a previously-built value; see the specifications
+below for full details.
+
address@hidden {Scheme Syntax} build-term ,val
address@hidden {Scheme Syntax} build-term ($continue k src exp)
address@hidden {Scheme Syntax} build-exp ,val
address@hidden {Scheme Syntax} build-exp ($const val)
address@hidden {Scheme Syntax} build-exp ($prim name)
address@hidden {Scheme Syntax} build-exp ($branch kt exp)
address@hidden {Scheme Syntax} build-exp ($fun kentry)
address@hidden {Scheme Syntax} build-exp ($rec names syms funs)
address@hidden {Scheme Syntax} build-exp ($closure k nfree)
address@hidden {Scheme Syntax} build-exp ($call proc (arg ...))
address@hidden {Scheme Syntax} build-exp ($call proc args)
address@hidden {Scheme Syntax} build-exp ($callk k proc (arg ...))
address@hidden {Scheme Syntax} build-exp ($callk k proc args)
address@hidden {Scheme Syntax} build-exp ($primcall name (arg ...))
address@hidden {Scheme Syntax} build-exp ($primcall name args)
address@hidden {Scheme Syntax} build-exp ($values (arg ...))
address@hidden {Scheme Syntax} build-exp ($values args)
address@hidden {Scheme Syntax} build-exp ($prompt escape? tag handler)
address@hidden {Scheme Syntax} build-cont ,val
address@hidden {Scheme Syntax} build-cont ($kargs (name ...) (sym ...) term)
address@hidden {Scheme Syntax} build-cont ($kargs names syms term)
address@hidden {Scheme Syntax} build-cont ($kreceive req rest kargs)
address@hidden {Scheme Syntax} build-cont ($kfun src meta self ktail kclause)
address@hidden {Scheme Syntax} build-cont ($kclause ,arity kbody kalt)
address@hidden {Scheme Syntax} build-cont ($kclause (req opt rest kw aok?) 
kbody)
 Construct a CPS term, expression, or continuation.
 @end deffn
 
@@ -886,19 +925,179 @@ There are a few more miscellaneous interfaces as well.
 A procedural constructor for @code{$arity} objects.
 @end deffn
 
address@hidden {Scheme Syntax} let-gensyms (sym ...) body ...
-Bind @var{sym...} to fresh names, and evaluate @var{body...}.
address@hidden deffn
-
address@hidden {Scheme Syntax} rewrite-cps-term val (pat term) ...
address@hidden {Scheme Syntax} rewrite-cps-exp val (pat exp) ...
address@hidden {Scheme Syntax} rewrite-cps-cont val (pat cont) ...
address@hidden {Scheme Syntax} rewrite-term val (pat term) ...
address@hidden {Scheme Syntax} rewrite-exp val (pat exp) ...
address@hidden {Scheme Syntax} rewrite-cont val (pat cont) ...
 Match @var{val} against the series of patterns @var{pat...}, using
 @code{match}.  The body of the matching clause should be a template in
-the syntax of @code{build-cps-term}, @code{build-cps-exp}, or
address@hidden, respectively.
+the syntax of @code{build-term}, @code{build-exp}, or @code{build-cont},
+respectively.
 @end deffn
 
address@hidden CPS Soup
address@hidden CPS Soup
+
+We describe programs in Guile's CPS language as being a kind of ``soup''
+because all continuations in the program are mixed into the same
+``pot'', so to speak.  A program in CPS is a map from continuation
+labels to continuation values.  As discussed in the introduction, a
+continuation label is an integer.  No label may be negative.
+
+As a matter of convention, label 0 should map to the @code{$kfun}
+continuation of the entry to the program, which should be a function of
+no arguments.  The body of a function consists of the labelled
+continuations that are reachable from the function entry.  A program can
+refer to other functions, either via @code{$fun} and @code{$rec} in
+higher-order CPS, or via @code{$closure} and @code{$callk} in
+first-order CPS.  The program logically contains all continuations of
+all functions reachable from the entry function.  A compiler pass may
+leave unreachable continuations in a program, but analyses should
+usually either apply only to reachable continuations, or should make
+translations that are oblivious as to whether a continuation is
+reachable or not.
+
address@hidden intmap
+The map itself is implemented as an @dfn{intmap}, a functional
+array-mapped trie specialized for integer keys.  Currently intmaps are a
+private data structure only used by the CPS phase of the compiler.  To
+work with intmaps, load the @code{(language cps intmap)} module:
+
address@hidden
+(use-modules (language cps intmap))
address@hidden example
+
+Intmaps are functional data structures, so there is no constructor as
+such: one can simply start with the empty intmap and add entries to it.
+
address@hidden
+(intmap? empty-intmap) @result{} #t
+(define x (intmap-add empty-intmap 42 "hi"))
+(intmap? x) @result{} #t
+(intmap-ref x 42) @result{} "hi"
+(intmap-ref x 43) @result{} @i{error: 43 not present}
+(intmap-ref x 43 (lambda (k) "yo!")) @result{} "yo"
+(intmap-add x 42 "hej") @result{} @i{error: 42 already present}
address@hidden example
+
address@hidden and @code{intmap-add} are the core of the intmap
+interface.  There is also @code{intmap-replace}, which replaces the
+value associated with a given key, requiring that the key was present
+already, and @code{intmap-remove}, which removes a key from an intmap.
+
+Intmaps have a tree-like structure that is well-suited to set operations
+such as union and intersection, so there is are also the binary
address@hidden and @code{intmap-intersect} procedures.  If the
+result is equivalent to either argument, that argument is returned
+as-is; in that way, one can detect whether the set operation produced a
+new result simply by checking with @code{eq?}.  This makes intmaps
+useful when computing fixed points.
+
+If a key is present in both intmaps and the key is not the same in the
+sense of @code{eq?}, the resulting value is determined by a ``meet''
+procedure, which is the optional last argument to @code{intmap-union},
address@hidden, and also to @code{intmap-add},
address@hidden, and similar functions.  The meet procedure will
+be called with the two values and should return the intersected or
+unioned value in some appropriate way.  If no meet procedure is given,
+the default meet procedure will raise an error.
+
+To traverse over the set of values in an intmap, there are the
address@hidden and @code{intmap-prev} procedures.  For example, if
+intmap @var{x} has one entry mapping 42 to some value, we would have:
+
address@hidden
+(intmap-next x) @result{} 42
+(intmap-next x 0) @result{} 42
+(intmap-next x 42) @result{} 42
+(intmap-next x 43) @result{} #f
+(intmap-prev x) @result{} 42
+(intmap-prev x 42) @result{} 42
+(intmap-prev x 41) @result{} #f
address@hidden example
+
+There is also the @code{intmap-fold} procedure, which folds over keys
+and values in the intmap from lowest to highest value, and
address@hidden which does so in the opposite direction.  These
+procedures may take up to 3 seed values.  The number of values that the
+fold procedure returns is the number of seed values.
+
address@hidden
+(define q (intmap-add (intmap-add empty-intmap 1 2) 3 4))
+(intmap-fold acons q '()) @result{} ((3 . 4) (1 . 2))
+(intmap-fold-right acons q '()) @result{} ((1 . 2) (3 . 4))
address@hidden example
+
+When an entry in an intmap is updated (removed, added, or changed), a
+new intmap is created that shares structure with the original intmap.
+This operation ensures that the result of existing computations is not
+affected by future computations: no mutation is ever visible to user
+code.  This is a great property in a compiler data structure, as it lets
+us hold a copy of a program before a transformation and use it while we
+build a post-transformation program.
+
+However, the allocation costs are sometimes too much, especially in
+cases when we know that we can just update the intmap in place.  As an
+example, say we have an intmap mapping the integers 1 to 100 to the
+integers 42 to 141.  Let's say that we want to transform this map by
+adding 1 to each value.  There is already an efficient @code{intmap-map}
+procedure in the @code{(language cps utils}) module, but if we didn't
+know about that we might do:
+
address@hidden
+(define (intmap-increment map)
+  (let lp ((k 0) (map map))
+    (let ((k (intmap-next map k)))
+      (if k
+          (let ((v (intmap-ref map k)))
+            (lp (1+ k) (intmap-replace map k (1+ v))))
+          map))))
address@hidden example
+
address@hidden intmap, transient
address@hidden transient intmaps
+Observe that the intermediate values created by @code{intmap-replace}
+are completely invisible to the program -- only the last result of
address@hidden value is needed.  The rest might as well share
+state with the last one, and we could update in place.  Guile allows
+this kind of interface via @dfn{transient intmaps}, inspired by
+Clojure's transient interface (@uref{http://clojure.org/transients}).
+
+The @code{intmap-add!} and @code{intmap-replace!} procedures return
+transient intmaps.  If one of these in-place procedures is called on a
+normal persistent intmap, a new transient intmap is created.  This is an
+O(1) operation.  In all other respects the interface is like their
+persistent counterparts, @code{intmap-add} and @code{intmap-replace}.
+
+If an in-place procedure is called on a transient intmap, the intmap is
+mutated in-place and the same value is returned.  If a persistent
+operation like @code{intmap-add} is called on a transient intmap, the
+transient's mutable substructure is then marked as persistent, and
address@hidden then runs on a new persistent intmap sharing structure
+but not state with the original transient.  Mutating a transient will
+cause enough copying to ensure that it can make its change, but if part
+of its substructure is already ``owned'' by it, no more copying is
+needed.
+
+We can use transients to make @code{intmap-increment} more efficient.
+The two changed elements have been marked @strong{like this}.
+
address@hidden
+(define (intmap-increment map)
+  (let lp ((k 0) (map map))
+    (let ((k (intmap-next map k)))
+      (if k
+          (let ((v (intmap-ref map k)))
+            (lp (1+ k) (@strong{intmap-replace!} map k (1+ v))))
+          (@strong{persistent-intmap} map)))))
address@hidden example
+
+Be sure to tag the result as persistent using the
address@hidden procedure to prevent the mutability from
+leaking to other parts of the program.  For added paranoia, you could
+call @code{persistent-intmap} on the incoming map, to ensure that if it
+were already transient, that the mutations in the body of
address@hidden wouldn't affect the incoming value.
+
 @node Compiling CPS
 @subsubsection Compiling CPS
 



reply via email to

[Prev in Thread] Current Thread [Next in Thread]