[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
status of "eval in scheme"
status of "eval in scheme"
Fri, 02 Oct 2009 19:19:43 -0700
Gnus/5.13 (Gnus v5.13) Emacs/23.0.92 (gnu/linux)
I haven't posted for a while about what I've been up to, so an update is
Some of you might have noticed a particular entry from Neil's summary of
the lovely meeting we had in the Pyrenees -- the one about implementing
`eval' in Scheme. I'd like to expound on that a bit.
As you probably know, now we have an interpreter and virtual machine.
Of course everything is an interpreter at some level, but our two
"engines" do have an important difference: one implements recursion via
recursive calls to a C function, and the other recurses via a private
stack -- the VM stack.
This causes a couple of problems. First of all, you can't tail-call from
interpreter to VM, or vice versa. Normally this isn't too bad, because
everything is compiled (and thus run in the VM), but you may have
noticed occasional. stack overflows rebuilding Guile after updating your
Guile checkout. This can happen e.g. if you are looping via a call to
`apply' or `call-with-values', but you're in interpreted code and
`apply' is compiled, or vice versa. It's an annoyance but we've been
able to work around it.
Secondly, we have to have (at least) two cases in all of our
introspection and debugging tools -- two differently-formatted stacks to
walk, two procedure formats, two (actually more) ways to implement a
procedure call, etc. When I started to think about implementing native
compilation, the resulting combinatoric explosion gave me a bit of
So what's the solution? Greater uniformity, in short. We'll achieve that
by ditching the C interpreter (almost) entirely. If `eval' is
implemented in Scheme, and thus compiled to the VM, all computation
stays in the VM -- giving us tail calls, uniform multiple-values
But wait, you say, Scheme implemented in Scheme? Isn't that a snake
eating its tail? Yes, indeed. There are two ways to break this cycle.
one is to require that the user have a working Scheme to compile Guile.
But that's really tricky to pull off. We'd either have to rewrite
Guile's compiler so that another Scheme could execute it, or otherwise
require a not-too-old version of Guile itself. Too hairy.
The second option is to bootstrap from the C compiler, using an
interpreter written in C *simply to compile eval.scm*. After that pass,
we have an eval.go -- and we never hit the C interpreter again.
There are some pitfalls, of course. The semantics of the Scheme `eval'
should match the semantics of the Scheme compiler, and the semantics of
the C `eval' -- and, it goes without saying, the semantics of the Scheme
standard. I submit that the best way forward is to write a new
interpretation algorithm, and to implement it in C and in Scheme.
This might not seem to be the obvious conclusion, but implementing a new
C interpreter has many upsides. I have always found eval.c (and
eval.i.c) to be horrendously complicated for it does. Lazy memoization
is complicated and not threadsafe. Taking a few thousand lines of C away
from libguile can only be a good thing.
The current C evaluator is also complicated because it has to detect
various syntax errors at runtime, due to lazy memoization. Separating
memoization and evaluation is a big win for simplicity, and it's not bad
for speed either -- the evaluator can accept /only/ memoized
expressions, and efficiently switch on the type of memoized expression.
The memoizer itself can be fairly simple, and implemented in C. It's
simple because it expects to see only expanded source -- the output of
psyntax. This has the corrolary that any code evaluated before psyntax
is loaded has to be expressed in already-expanded scheme -- you can have
`define', you can have `lambda', but you can't have e.g. `do' or `case'.
The output of the memoizer doesn't have to be Scheme at all -- in fact,
as I have it implemented, it's most similar to Tree-IL. Lexical
variables, for example, can be addressed by index and not by name -- all
the classic interpreter optimizations apply, but clearly.
The real coup is the new simplicity of eval.c, and its similarity to the
new eval.scm -- which, once compiled, solves the problems I mentioned in
the beginning of this long mail.
* * *
So that's the vision. In practice, there are a few things that need
doing before we can make the switch, all related to functionality
currently only implemented in the interpreter.
First of all, we have a profusion of different kinds of procedures.
Dispatch to all the different kinds of subrs is only performed by the
current C evaluator. What needs to happen is to simplify this set of
different kinds of procedures, and to implement them in the VM. The
first part is mostly finished in the "subr-simplification" branch,
except for rpsubrs. I have yet to pull it into the VM, though.
Secondly, GOOPS dispatch needs to be supported by the VM. (Currently, we
just call out to the interpreter). This is actually quite a task, as it
turns out. The issue is that Scheme has no way of dealing with arguments
on a stack in a first-class way. You can't write a procedure that takes
either 3 or 4 arguments without consing up a tail arg. So if you have a
generic function, with methods of differing arities, you'd have to cons
up the list of the arguments, and dispatch on that list.
Actually, Scheme does have a way to deal efficiently with multiple
arguments: SRFI-16's case-lambda. Guile needs to compile case-lambda to
appropriate bytecode in which the /callee/ -- the procedure itself --
dispatches on its arguments, potentially raising a wrong-num-args error
itself. (Currently in the VM, the caller checks the argument count.)
This entails a change in the calling convention, and thus in the
VM and the debugging tools, and a change in Tree-IL. It's a big one. I
have the low-level bits implemented in my case-lambda branch, but it's
going to be another week or two until it's baked.
Once we have case-lambda, we can implement some kind of schemely
typecase-lambda for the default "effective method" of a generic
function, pushing more of GOOPS' implementation into Scheme. This will
finally satisfy Jao, who has been bugging me to allow method
combinations in GOOPS for quite some time now ;)
Then once all that's done, we'll have all dispatch in the VM, and can
avoid calling the interpreter from the VM, allowing us to make the
switch in interpreters.
So the road is a little longer than I had hoped for when I first
proposed this path, but I think I can finish it before 1.9.5. I don't
think any of this will hit 1.9.4, because, well, time passes :) But I
just had a month of not so much Guilehack, so now I get to indulge
myself in more Guilehack for the next few weeks.
Well, that's that. I said I'd merge in Daniel's work too whenever Mark's
patch landed, but it might be worth it to merge it in sooner rather than
later. It seems the Emacs scene is starting to heat up :)
- status of "eval in scheme",
Andy Wingo <=