
Re: native compilation units

From: Lynn Winebarger
Subject: Re: native compilation units
Date: Sun, 19 Jun 2022 21:39:28 -0400

On Sun, Jun 19, 2022 at 7:02 PM Stefan Monnier <monnier@iro.umontreal.ca> wrote:
> Currently compiling a top-level expression wrapped in
> eval-when-compile by itself leaves no residue in the compiled output,

`eval-when-compile` has 2 effects:

1- Run the code within the compiler's process.
   E.g.  (eval-when-compile  (require 'cl-lib)).
   This is somewhat comparable to loading a gcc plugin during
   a compilation: it affects the GCC process itself, rather than the
   code it emits.

2- It replaces the (eval-when-compile ...) thingy with the value
   returned by the evaluation of this code.  So you can do (defvar
   my-str (eval-when-compile (concat "foo" "bar"))) and you know that
   the concatenation will be done during compilation.
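Both effects in one small sketch (reusing the `require` and `concat` examples above):

    ;; Effect 1: make cl-lib available inside the compiler's process,
    ;; affecting compilation itself rather than the emitted code.
    (eval-when-compile (require 'cl-lib))

    ;; Effect 2: the whole form is replaced by its compile-time value,
    ;; so the .elc contains the constant "foobar", not a call to `concat'.
    (defvar my-str (eval-when-compile (concat "foo" "bar")))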

> but I would want to make the above evaluate to an object at run-time
> where the exported symbols in the obstack are immutable.

Then it wouldn't be called `eval-when-compile` because it would do
something quite different from what `eval-when-compile` does :-)

The informal semantics of "eval-when-compile" from the Elisp info manual are:
     This form marks BODY to be evaluated at compile time but not when
     the compiled program is loaded.  The result of evaluation by the
     compiler becomes a constant which appears in the compiled program.
     If you load the source file, rather than compiling it, BODY is
     evaluated normally.
I'm not sure what I have proposed that would be inconsistent with "the result of evaluation 
by the compiler becomes a constant which appears in the compiled program".
The exact form of that appearance in the compiled program is not specified.
For example, byte-compiling (eval-when-compile (cl-labels ((f ...) (g ...)) ...))
currently produces a byte-code vector in which f and g are byte-code vectors with
shared structure.  However, that representation is only one choice.
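A hedged sketch of the kind of form meant here (the bodies of f and g are
hypothetical; the point is that the compile-time value, including any shared
structure between the two byte-code objects, is dumped as a constant):

    (eval-when-compile (require 'cl-lib))  ;; cl-labels needs cl-lib

    (defvar my-pair
      (eval-when-compile
        (cl-labels ((f (x) (g (1+ x)))
                    (g (x) (if (> x 10) x (f x))))
          ;; f and g refer to each other, so the two byte-code
          ;; objects in the dumped constant share structure.
          (cons #'f #'g))))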

It is inconsistent with the semantics of *symbols* as they currently stand, as I have already admitted.
Even there, you could advance a model under which it is not inconsistent.  For example,
suppose you view the binding of a symbol to a value as having two components: the binding itself, and the cell
holding the mutable value during the extent of the symbol as a global/dynamically scoped variable.
Then binding the symbol to the final value of the cell, just before the dynamic extent of the variable
terminates, would be consistent.  That's not how it's currently implemented, because with the
current semantics there is no way to express the final compile-time environment as a value
after compilation has completed.

The part that's incompatible with the current semantics of symbols is importing such a symbol as
an immutable symbolic reference: not really a "variable" reference, but a binding
of a symbol to a value in the run-time namespace (or package, in CL terminology, although
as far as I know CL did not provide any way to specify what I'm suggesting either).

However, that would capture the semantics of ELF shared objects, whose text and ro_data
segments are loaded into memory that is in fact immutable for a userspace program.

> byte-code (or native-code) instruction arrays.  This would in turn enable
> implementing proper tail recursion as "goto with arguments".

Proper tail recursion elimination would require changing the *normal*
function call protocol.  I suspect you're thinking of a smaller-scale
version of it specifically tailored to self-recursion, kind of like
what `named-let` provides.  Note that such ad-hoc TCO tends to hit the same
semantic issues as the -O3 optimization of the native compiler.
E.g. in code like the following:

    (defun vc-foo-register (file)
      (when (some-hint-is-true)
        (load "vc-foo")
        (vc-foo-register file)))

the final call to `vc-foo-register` is in tail position but is not
a self call because loading `vc-foo` is expected to redefine
`vc-foo-register` with the real implementation.
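For reference, the self-recursion-only form of this is what `named-let` (in
subr-x, Emacs 28 and later) provides: a minimal sketch, where the tail call to
`loop' can compile to a jump rather than a nested funcall.

    (eval-when-compile (require 'subr-x))  ;; named-let is a macro in subr-x

    ;; Deep self-recursion that would overflow without TCO:
    (named-let loop ((n 1000000) (acc 0))
      (if (zerop n)
          acc
        (loop (1- n) (+ acc n))))  ;; => 500000500000

Because the recursion is resolved statically, a redefinition like the
`vc-foo-register' example above would not be seen, which is exactly the
semantic issue mentioned.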

I'm only talking about the steps that are required to allow the compiler to 
produce code that implements proper tail recursion.
With the abstract machine currently implemented by the byte-code VM,
the "call[n]" instructions will always be needed to call out according to
the C calling conventions.
The call[-absolute/relative] or [goto-absolute] instructions I suggested
*would be* used in the "normal" function-call protocol in place of the current
funcall dispatch, at least to functions defined in lisp.  
This is necessary but not sufficient for proper tail recursion.
To actually get proper tail recursion requires the compiler to use the instructions
for implementing the appropriate function call protocol, especially if
"goto-absolute" is the instruction provided for changing the PC register.
Other instructions would have to be issued to manage the stack frame
explicitly if that were the route taken.  Or,  a more CISCish call-absolute
type of instruction could be used that would perform that stack frame
management implicitly.
Either way, it's the compiler that has to determine whether a return
instruction following a control transfer can be safely eliminated or not.
If the "goto-absolute" instruction were used, the compiler would
have to decide whether the address following the "goto-absolute"
should be pushed in a new frame, or if it can be "pre-emptively
garbage collected"  at compile time because it's a tail call.
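A sketch of that decision, using the hypothetical instruction names from this
thread (these are proposed instructions, not existing bytecodes):

    ;; Non-tail call to F: the address after the jump must survive,
    ;; so the compiler pushes a return frame first.
    ;;     push-frame RETURN-ADDR
    ;;     goto-absolute F
    ;; RETURN-ADDR:
    ;;     ...
    ;;
    ;; Tail call to F: the following address is dead, so the compiler
    ;; "pre-emptively garbage collects" it and reuses the current frame.
    ;;     goto-absolute F

A CISCish call-absolute would fold the push-frame into the transfer itself;
the tail-call case would then need a distinct tail-call-absolute (or the bare
goto-absolute) to skip the frame push.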
> I'm not familiar with emacs's profiling facilities.  Is it possible to
> tell how much of the allocated space/time spent in gc is due to the
> constant vectors of lexical closures?  In particular, how much of the
> constant vectors are copied elements independent of the lexical
> environment?  That would provide some measure of any gc-related
> benefit that *might* be gained from using an explicit environment
> register for closures, instead of embedding it in the
> byte-code vector.

No, I can't think of any profiling tool we currently have that can help
with that, sorry :-(

Note that when support for native closures is added to the native
compiler, it will hopefully not be using this clunky representation
where capture vars are mixed in with the vector of constants, so that
might be a more promising direction (may be able to skip the step where
we need to change the bytecode).

The trick is to make the implementation of the abstract machine by each of the
compilers have enough in common to support calling one from the other.
The extensions I've suggested for the byte-code VM and lisp semantics
are intended to support that interoperation, so the semantics of the byte-code
implementation won't unnecessarily constrain the semantics of the native-code
implementation.
