[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: CPS and RTL

From: Stefan Israelsson Tampe
Subject: Re: CPS and RTL
Date: Sat, 17 Nov 2012 21:18:39 +0100

Hi Noha,

very clean code indeed, nice work, (mine rtl compiler is much more a bit of a hack btw :-))

One thing to think about from the beginning is how to protect variables from gc and at the same time
offer unreachable vars to be gc:d at an early state. In guile-2.0 we kept using a stack for e.g. the middle terms
in the executions of expressions like (+ (+ x y) (+ w z)) and there the end stack pointer was updated after each
primitive. This is an overhead and in the rtl code wingo's trying to remove the need for a sp register in hardware
and in stead occasionally update a vp->sp value probably sitting in the memory cache. The problem with this is that now
we must take care to occasionally write over memory regions in order to allow the contents therein being gc'd at an
early state. This means that we must record which registers have been used between the erasing's. Now what I tried
to use is that the activities for which we will need to have well defined sp is mostly quite expensive so we can allow to
update it then. Then looking at instructions like (add dst argx argy) in most cases the place to put in vp->sp is the dst
location, but not always, I took the path to keep a base sp memory location which we could call bsp and I stored it in
vp->sp. Now when the add need to go out to the slow path, which may cons, it will check if dst is below bsp, if so it will use bsp as the
sp position, else it will use dsp e.g. it will temporary set vp->sp to &fp[dst]. when executing the slow path. This method
will mean that we need at times set the bsp as the code flows, but it will not be many places in practice, and they will most often be out of the hot path, so I added a (set-sp pos) rtl vm-op and tried with that. In all this means that I got a very effective
way of keeping track of the end of the stack with the same or better accuracy as before and it's really fast and there is no need to keep vp->spĀ  in a hardware register. The problem is that the code base got more comlpex both in C land and in scheme and I needed to store the bsp across function call's along with fp because bsp could be pointing at somewhere
other the start of the next function frame. We could solve this on the other hand by demanding the destination of a call
always sit's at the end of the sp and add a mov operation after teh call if needed instead, meaning that we can save on the amount of storage needed to be done on each frame.

Anyway my message here is that one need to make sure that the register allocator play's nicely with the mechanism of marking the region eligible for gc or not for gc.

Another thing to look out for is the tail call's and berhaps multiple value returns, You will need to fill in the start of the frame
with the at say register 0 and make sure that if it's not used later just overwrite or if it's used later move it and use that
location for that's register location (actually named let is another similar activity where you do a goto back to the start but then 0 is not the start of the registers to mangle. Maybe You solved this. And one can do that by just parsing the continuation of that call to see if there is a reference to it. If one chooses that it can be ok, I choose to delay the evaluations of the actual register number to a later phase by in stead of numbers store (lambda () (+ (d) i)), where is is the number I would have chosen if there was no problem and d is a function who's value is determined later by later usages of the same variable identity. I think that this is overly complicated and would today just go after a simple scan of the continuation.

I will try to comment more later.


On Sat, Nov 17, 2012 at 4:35 PM, Noah Lavine <address@hidden> wrote:

I've had several conversations on this list about using continuation-passing style in Guile. I recently decided to take the hint and implement it. I've pushed a new branch called wip-rtl-cps that I'd appreciate comments on (but I do not necessarily think that this is the branch that will be merged with master - it's still pretty rough).

The branch contains a CPS data structure based on the paper "Compiling with Continuations, Continued", by Andrew Kennedy. It lives in module/language/cps.scm. The branch is based on the RTL branch, and there's a CPS->RTL compiler in module/language/cps/compile-rtl.scm. There are also tests for it in test-suite/tests/cps.test. Here are some notes:

- It may have been a mistake to do a CPS->RTL compiler first. Even if we want to use RTL eventually, it probably would have been more sensible to do CPS->GLIL first, so the change would be independent of the switch to RTL. However, the code is already here - we can use it as-is, or change to GLIL, or something else.

- There's no Tree-IL->CPS compiler yet, mostly because I'm confident that we can do it, so I figured we could always add it later.

- There are three major language features missing from the compiler: top-level variables, closures, and anything to do with the dynamic environment. Those are all pretty major features - that's why I said this was rough.

- Right now the register allocation pass operates directly on the CPS. I have been thinking about making some higher-order functions for traversing it, similar to tree-il-fold, but I haven't done it yet.

Basically, I have a very rough compiler. I'd like to show it to people now, even though it's not fully formed, so that you can point out any issues, and so that we can see if this is something that could eventually wind up in mainline Guile. I'd rather fix problems now than later on. (And of course I would appreciate help from anyone else who wants to contribute.)


reply via email to

[Prev in Thread] Current Thread [Next in Thread]