Re: CPS language and Tree-IL->CPS->RTL compiler

On Thu, Aug 29, 2013 at 3:49 AM, Andy Wingo <address@hidden> wrote:

Hi all,

This patchset implements the CPS compiler I've been working on over the
last couple months. At the end of it, you can

guild compile -t rtl foo.scm

and, if there are no bugs or unimplemented bits, at that point you have
a loadable .go with RTL code. (Try readelf -a on the .go!)

>From the console the way to do it is:

(use-modules (system vm objcode))
(load-thunk-from-memory (compile foo #:to 'rtl))

That gives you the thunk that, when called, will execute FOO.

So with that intro, some more on CPS and the RTL VM. As you know we
have a new register-based VM, where all values have names (slots). It's
appropriate that the language that compiles to this VM also give names
to all intermediate values. CPS has this property. In addition, CPS
gives names to all control points, effectively giving a label to each
_expression_.

This CPS language was inspired by Andrew Kennedy's great 2007 paper,
"Compiling with Continuations, Continued". In particular, continuations
are /local/ to a function, so they really are basic block labels, and
all values are bound to variables via continuation calls -- even
constants.

As a little example:

(+ 1 2)

Here + is a primcall, so we would get:

($letk ((kone ($kargs (oneb)
($letk ((ktwo ($kargs (two)
($continue ktail
($primcall + one two)))))
($continue ktwo ($const 2))))))
($continue kone ($const 1)))

Here all CPS language constructs are prefixed with "$". Everything else
is a variable, except the + in the primcall.

As you can see it is incredibly verbose. At the same time it's very
simple, as there are only two kinds of terms: terms that bind
continuations, and terms that call continuations.

$letk binds a set of mutually recursive continuations, each one an
instance of $cont. A $cont declares the name and source of a
continuation, and then contains as a subterm the particular
continuation instance: $kif for test continuations, $kargs for
continuations that bind values, etc.

$continue nodes call continuations. The _expression_ contained in the
$continue node determines the value or values that are passed to the
target continuation: $const to pass a constant value, $values to
pass multiple named values, etc.

Additionally there is $letrec, a term that binds mutually recursive
functions. The contification pass will turn $letrec into $letk if
it can do so. Otherwise, the closure conversion pass will desugar
$letrec into an equivalent sequence of make-closure primcalls and
subsequent initializations of the captured variables of the
closures. You can think of $letrec as pertaining to "high CPS",
whereas later passes will only see "low CPS", which does not have
$letrec.

There are a bunch of Guile-specific quirks in this language, mostly
related to function prologues for the different kinds of arities, and
for things like multiple-value truncation and prompts. Check out
(language cps) for all the deal.

So, after that patch is the Tree-IL->CPS compiler. It simplifies a
number of Tree-IL concepts, fixing argument order, turning toplevel
references to primcalls, transforming prompts, assignment conversion,
etc. For this reason it's a bit hairy, but it seems to work fine. This
compiler runs *after* optimization passes on Tree-IL, so you still have
peval that runs over tree-il.

After that follow a bunch of passes to build up an RTL compiler. The
idea is to compile by incremental source-to-source passes, and at the
end you just assign slots to all variables and emit code directly.

If you ever worked with the old (language tree-il compile-glil) module,
you know it's very hairy, mostly because it's mixing code emission with
semantic transformations. Nowhere is this more evident than the
so-called "labels allocation" strategy, in which we try to allocate
procedures as basic blocks, if they all return to the same place. It's
a funky pass. It turns out this concept is well-studied and has a name,
"contification", and is cleanly expressed as a source-to-source pass
over CPS. So compile-rtl.scm is dumb: it just loops over all
expressions, emitting code for each of them, and emitting jumps and
shuffling registers as needed (based on a prior analysis pass).

In the end you can ,disassemble the code, thanks to the earlier RTL
work.

I don't want people to benchmark this stuff yet -- it's buggy and not
optimized. Don't use this work for anything serious. But if you're in
to hacking on it, that's cool. There are a number of optimization
passes that are needed (see compile-rtl.scm for a list), but currently
it's looking like the RTL compiler does significantly better on loops,
but struggles to keep up with the old VM for calls. This is
unsurprising, as calls really do work like stacks, and a stack VM has
many advantages there. But we think that with better slot allocation we
can probably be faster for calls as well.

Thank you, thank you, thank you to Mark Weaver for helping out with this
work! Without him there would be many more bugs. I've folded all of
his patches into this patchset, so please consider him the joint author
of all of this. In any case the bugs have forced him to page much of
this code into his head so we are better equipped as a project because
of that ;-)

Thank you also to Noah Lavine, who did an earlier pass at CPS
conversion. I considered starting from his work but it became clear
that many aspects of CPS would be nicer with changes in Tree-IL -- so I
took advantage of my maintenance of the old compiler to make various
changes there and in other parts of the runtime to get a cleaner CPS
compiler. In the end we took advantage of his vanguard-hacking, though
as ideas rather than as code. Thank you Noah!

There are not as many tests and documentation as one would like.
Ultimately it all just came together in the last couple of weeks, so I
think the next step is to write more tests and try the compiler on new
pieces of code. There are a couple bugs that we know about at this
point, and surely many more that we don't know about. I ask for some
lenience on this front while we figure out what the compiler should look
like :)

OK. WDYT, Ludo? Comments? OK to merge? :)

Cheers,

Andy

From:	Noah Lavine
Subject:	Re: CPS language and Tree-IL->CPS->RTL compiler
Date:	Thu, 29 Aug 2013 17:52:51 -0400