[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [epsilon-devel] Questions about Epsilon and call-stack introspection
Re: [epsilon-devel] Questions about Epsilon and call-stack introspection.
Sat, 04 Jan 2014 16:59:10 +0100
Gnus (Ma Gnus v0.8), GNU Emacs 22.214.171.124, x86_64-unknown-linux-gnu
Hello Basile. Sorry for answering with a little delay, but I didn't
want to be too tired when writing this.
As I told you, I was planning to write a blog post on this, but I'm very
happy that now we're using the mailing list instead. This way people
can see that development is alive, and I can write informally without
being slowed down by my unproductive perfectionism.
On 2014-01-02 at 12:41, Basile Starynkevitch wrote:
> Happy new 2014 year to everyone.
Thanks. Same on you.
> First, I am impatiently waiting for Epsilon to get totally rid of Guile.
I'll try to work on it this evening and tomorrow as well.
The way I describe things below may or may not be accurate with respect
to who proposed what for the first time; let's say it's just the way I
organize things in my brain, when I ponder about these questions.
Basile has known epsilon for a long time, and he's always given me
friendly feedback. In particular we spoke about all these questions at
some length at a seminar of mine about epsilon last October, and it's
not unlikely that he/we touched the same topics before as well.
Now that I think of it, Basile also advocated the idea of running epsilon
on the bare metal, possibly before I did. (However I've been planning
to run epsilon on a small machine without a C runtime at least since
March 2013; you can find a Usenet post of mine, written in my usual
> Then, I really think that Epsilon should be a language in which one can
> code its own (e.g. copying generational Cheney like) garbage collector.
> Very few languages have this ability (and it is a shame, see however
> Scheme48 and its PreScheme). To make that possible, one has to have
> primitive to inspect and update call frames on the stack.
Let me make this proposal a little more precise, the way I see it.
I start with "the minor stack change": primitives should have access to
the stack in order to read and update boxed objects saved on the stack.
Pros: the ability of writing memory managers in epsilon itself, making
the "external" C runtime even smaller. Particularly good for
small embedded systems, which indeed I would like to support;
Cons: a relatively small semantic change.
The minor change entails a very small change to the semantics of
epsilon0: namely, primitives should receive the current stack as an
additional parameter. The new parameter shall only be visible at the
semantic level, and shall make the stack similar to the global state
(uppercase gamma in the semantics of Chapter 2) in this respect;
however, primitives shall still *not* return an updated stack, as part
of the minor stack change.
Of course most primitives would not access the stack, to keep things
clear; we might have just one new primitive taking advantage of the new
provision. I propose this:
Zero parameters, two results: the current-thread stack as a buffer,
and the current stack height.
In practice it's unacceptable for efficiency reasons to require that all
values be always stored on the stack; for this reason we need a new
epsilon0 language form to selectively force that when required.
Unfortunately it can't be simply a primitive, because it has effects on
how the surrounding code can be compiled.
For efficiency's sake, control-stack:get shall return *the*
current-thread stack, which shall be an ordinary epsilon buffer (at this
time it's not), without making a copy of it. Of course the result may
then be copied, if needed.
New epsilon0 language form:
One or more parameters: an epsilon0 procedure, and its actuals;
As many results as the procedure.
Formal semantics: identical to e0:call as per Chapter 2.
Rationale: for efficiency reasons, in compiled code as many temporary
values as possible as kept in machine registers rather than on the
stack; in a sophisticated implementation this may also true for some
temporaries belonging to caller frames. This complication is hidden
from the semantics but is very real (at least, I will take advantage of
this in the future). Reality may actually be even more complicated:
some memory-management systems, including most copying collectors,
enforce some representation constraints on values: it's what I call
"boxedness tags" in Chapter 3. Now, it would be very nice to be able to
temporarily break these constraints in an implementation, and use a more
efficient representation in some places where efficiency matters (again,
I'd like to take advantage of this in the future).
Implementation note: e0:call-with-saved-temporaries shall:
- evaluate its actual parameters into new temporaries;
- save all temporaries which are currently in registers to the stack,
adding boxedness tags where absent if required by the implementation;
- call the procedure;
- at procedure return, restore still-alive temporaries from the stack
into registers, removing again boxedness tags where required;
- return results.
This design permits an efficient implementation, while still providing
the user with a clean view of the stack when needed.
Simply writing into the buffer with buffer:set! updates the saved
values. It should be done within the extent of the procedure called by
e0:call-with-saved-temporaries, in order to ensure that temporaries held
in registers are also affected.
Of course this is highly unsafe in general. But as usual, I reply that
epsilon0 is not intended for humans, and epsilon1 is not intended for
At this point we can add stack introspection as well, which is one of
Basile's main concerns: just knowing which stack frame belongs to which
procedure is not a terribly delicate issue. It can be done with one
Two parameters: a stack, a stack height;
one result: a vector holding <program point, initial frame height,
final frame height> triples.
Somewhere in the global state, a compiler-generated table shall map each
program point into a procedure name, a source locus and an alist mapping
(at least) each alive local variable name into its frame offset.
control-stack:introspect might either be a trivial wrapper for a
procedure with the same semantics, or it might be built on top of some
lower-level primitives, in any way convenient for an efficient
implementation. (For example Basile knows how the ocamlopt runtime uses
a global hash mapping each function return address saved on the stack to
something similar to what I call program point).
If I recall correctly, Basile proposed the minor stack change above
(I've just made explicit the required design changes, as I see them)
plus call stack introspection. The primitive semantics change is no big
deal. On the other hand epsilon0 is a very minimalistic language so I'm
not very happy at the idea of adding a new form to it; however this
really seems necessary. So be it.
> Also, being able to update the call stack, including by inserting new
> (or removing existing) call frames in the middle of it, would be very
> interesting, requires deep support from the compiler (and the language)
> and would enable an interesting implementation of continuations and
> call/cc. Current Epsilon1's approach (whole program CPS transformation)
> is not really funny (and not very original neither).
However, I responded to Basile, if we go to the trouble of letting the
user explicitly write on the stack, I'd really like to do more: let's
have the user update call frames as well. This is another way of
implementing continuations, different from a transform, possibly simpler
and more efficient. It can also be used for coroutines, and green
threads/fibers, which is useful. Or simply exceptions --- the one most
important language feature missing from epsilon1, in my opinion ---
again in an efficient manner. Let's call this hypothetical "the major
If it was Basile who first came out with the major stack change idea
instead of me, I apologize to him once more; again, I've been pondering
on this for quite some time, and the origin of each idea now is a little
muddled in my head.
Let me draft a design of the major stack change.
Pros: the ability of writing non-local control features, even in epsilon0;
All the pros of the minor stack change.
Cons: a *big* semantic change in epsilon0, plus the one from the minor
This entails a further change in the semantics of primitives: now
primitives should also be able to change the current stack: in the new
primitive semantics, the stack becomes both a parameter and a result.
Of course this new updated stack coming out of a primitive shall become
the stack of the primitive caller. This change has profound
implications on the semantics: once a procedure call terminates, it's
not necessarily the case any more that control goes back to its caller.
For example the proofs (and the results) in Chapter 4 break.
I'd have only one primitive perform destructive stack changes:
Two parameters: the new stack, its height;
Zero results [or possibly one boolean result, à-la-setjmp?].
control-stack:set! replaces the stack with the given one, pushed its
zero- [or one-?] bundle result, and goes on from there.
When this primitive is called we just forget about its caller: in
general it will change into something completely different, starting
from the execution of the last part of e0:call-with-saved-temporaries.
Then, I suppose, at some point it would execute the last part of
e0:call-with-saved-temporaries, who reified the stack in the first
> Defining the good
> primitives (and probably a few more elementary control flow operations
> for call stack update, à la setcontext(3) in some Linux; see
> http://man7.org/linux/man-pages/man3/getcontext.3.html ...) to be able
> to push and pull a segment of the call stack would be much more
> challenging, and would enable quite interesting implementations of
> call/cc (and other non-local control flow).
Indeed. I don't claim to understand all the implications of
control-stack:set!, or how to use it in general. Just like call/cc, I'm
sure that its details are deeper than they look. Simple in a way, but
I wouldn't add explicit chained call frames: that's inefficient when
you're not actually performing nonlocal jumps. However, sure, we can
keep the stack short by copying it and replacing it when needed. At a
low level, I'd go with simply control-stack:get and control-stack:set!.
> Few languages have the required features to implement garbage collection
> and call/cc by call stack introspection and modification, and I really
> believe that Epsilon0 should have them (and having them will give
> Epsilon0 a decisive advantage...).
Let me make it clear (to the other people reading) that user-defined
garbage collection only depends on the minor stack change; it doesn't
even require stack introspection.
I find the minor stack change quite uncontroversial, but the major one
still troubles me. The central idea in epsilon is to be able to
implement language features on top of a very simple and
easy-to-understand model. The current epsilon0 does qualify as easy
and epsilon0 plus the minor stack change does as well, because
procedures still have a very uniform, clear behavior:
In case of tail calls, just replace a call call..return return sequence
with goto goto... Still trivially simple to understand and implement.
Allowing each expression to replace the stack anyway feels a little like
having call/cc in the core language, which is one of the things I
criticize about Scheme: a too complex foundation.
My CPS transform was not really usable, but something more complex along
those lines, if possible and still efficient, would make me happier than
the major stack change. I was thinking of compiling each procedure
twice, one in direct form and one in CPS; and to keep them both
available, using the direct (faster) version whenever it can be proved
that the current code doesn't escape.
Isn't any form of CPS overkill just to implement exceptions? Probably.
I'm certainly bothered by the idea of depending on the garbage collector
An alternative is what Xavier Leroy calls Exception-Returning Style
epsilon bundles (which on the other hand don't interact very well with
CPS) permit to implement ERS without heap allocations, which is good,
but performing a full ERS transform on all the code would still be
inefficient. Having ERS and CPS one on top of another to implement
continuations (which I also want sometimes) is probably unacceptable
from a performance point of view. Having them side by side, compiling
each procedure three times, would work better, even if it would make it
more difficult to have one form of code call another; they all have
incompatible calling conventions.
So, on one hand, the major stack change looks like a just slightly more
ambitious version of the minor change, and they should go hand in hand;
on the other, the major change modifies the nature of the epsilon0 in a
radical way, which makes me hesitate. But the more I think of it, the
more I suspect that doing everything with transforms is not really
feasible. I'd like to experiment with transforms some more, because
I've played relatively little with them, before completely giving up --
Transforms are still useful of course, but much less than I thought when
dealing with control forms.
In the end, as I write this, I understand that you've convinced me.
I will work on this after finishing the s-expression frontend and
removing the Guile dependency. Then I think I'll make a release (you
were even suggesting to call it 1.0; OK, I can, but I will not guarantee
any backward compatibility), and then work on the stack changes. I
think it will not even be hard to implement.
Thanks a lot for your feedback,
Home page: http://ageinghacker.net
GNU epsilon: http://www.gnu.org/software/epsilon