Re: native compilation units

On Wed, Jun 15, 2022 at 8:23 AM Stefan Monnier <monnier@iro.umontreal.ca> wrote:

> There is one kind of expression where Andrea isn't quite correct, and that
> is with respect to (eval-when-compile ...).

You don't need `eval-when-compile`. It's already "not quite correct"
for lambda expressions. What he meant is that the function associated
with a symbol can be changed at any moment. But if you call
a function without going through such a globally-mutable indirection the
problem vanishes.

I'm not sure what the point here is. If all programs were written with every variable
and function name lexically bound, then there wouldn't be an issue.
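A minimal sketch of the distinction being discussed, assuming the file is loaded with lexical binding enabled. The names `my-greet`, `my-call-via-symbol`, and `my-captured-greet` are hypothetical and exist only for illustration:

```elisp
;;; -*- lexical-binding: t -*-

(defun my-greet ()
  "Hypothetical function that gets redefined further down."
  "hello")

(defun my-call-via-symbol ()
  "Call `my-greet' through its symbol, so a later redefinition is visible."
  (my-greet))

(defvar my-captured-greet
  ;; Capture the function object itself in a lexical variable.  The
  ;; closure no longer consults the symbol's function cell when called.
  (let ((f (symbol-function 'my-greet)))
    (lambda () (funcall f)))
  "Closure holding the original definition of `my-greet'.")

;; Redefine the function behind the symbol at run time.
(fset 'my-greet (lambda () "goodbye"))

(my-call-via-symbol)          ; sees the new definition: "goodbye"
(funcall my-captured-greet)   ; still the captured one:  "hello"
```

The call through the symbol follows the globally mutable function cell, while the closure keeps whatever function object was captured when it was created.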

After Andrea's response to my original question, I was curious whether the kind of
semantic object that an ELF shared-object file *is* can be captured (directly) in the
semantic model of Emacs Lisp, including the fact that some symbols in ELF are bound
to truly immutable constants at runtime by the loader. Also, if someone were to
rewrite some of the primitives now in C in Lisp and rely on the compiler for their use,
would there be a way to write them with the same semantics they have now (not
referencing the run-time bindings of other primitives)?

Based on what I've observed in this thread, I think the answer is either yes or almost yes.
The one sticking point is that there is no construct for retaining the compile-time environment.
If I "link" files by concatenating the source together, it's not an issue, but I can't replicate
that with the results the byte-compiler currently produces.

What would also be useful is some analogue to Common Lisp's package construct, but extended
so that symbols could be imported from compile-time environments as immutable bindings.
Now, that would be a change in the current semantics of symbols, unquestionably, but
not one that would break the existing code base. It would only come into play when compiling
a file as a library, with semantics along the lines of:

  (eval-when-compile
    (namespace <name of library obstack>)
    <library code> ...
    (export <symbol> ...))

Currently, compiling a top-level expression wrapped in eval-when-compile by itself leaves
no residue in the compiled output, but I would want to make the above evaluate
to an object at run-time where the exported symbols in the obstack are immutable.
Since no existing code uses the above constructs - because they are not currently defined -
it would only be an extension.
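For comparison, a small sketch of what `eval-when-compile` does today when it appears in value position rather than alone at top level. The constant name `my-squares` is hypothetical. When the file is byte-compiled, the body runs in the compiler and only its value is kept as a constant; nothing of the compile-time environment that produced it survives:

```elisp
;;; -*- lexical-binding: t -*-

(defconst my-squares
  (eval-when-compile
    ;; When the file is byte-compiled, this form is evaluated by the
    ;; compiler and replaced by its value.  When the file is merely
    ;; interpreted, `eval-when-compile' behaves like `progn'.
    (mapcar (lambda (n) (* n n)) '(1 2 3 4)))
  "Squares of 1..4, computed at compile time when the file is compiled.")
```

Either way `my-squares` ends up holding `(1 4 9 16)`, but it remains an ordinary, rebindable global rather than an immutable exported binding of the kind proposed above.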

I don't want to restart the namespace debates - I'm not suggesting anything to do
with the reader parsing namespaces from prefixes in symbol names.

>> It's also "modulo enough work on the compiler (and potentially some
>> primitive functions) to make the code fast".
> Absolutely, it just doesn't look to me like a very big lift compared to,
> say, what Andrea did.

It very much depends on the specifics, but it's definitely not obviously true.
ELisp, like Python, has grown around a "slow language" so its code is
structured in such a way that most of the time the majority of the code
that's executed is actually not ELisp but C, over which the native
compiler has no impact.

That's why I said "look[s] to me", and inquired here before proceeding.

Having looked more closely, it appears the most obvious safe approach,
one that doesn't require any ability to manipulate the C call stack, is to introduce
another manually managed call stack, as is done for the specpdl stack, but
malloc'ed (I haven't reviewed that implementation closely enough to tell if it
is stack or heap allocated). That does complicate matters.

That part would be for allowing calls to (and returns from) arbitrary points in
byte-code (or native-code) instruction arrays. This would in turn enable
implementing proper tail recursion as "goto with arguments".
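To make "goto with arguments" concrete, here is a source-level sketch only; it says nothing about how the byte-code VM would actually implement it. The function names are hypothetical. The first definition is tail-recursive and consumes eval depth on every call; the second is the loop a compiler with proper tail calls could effectively produce, where the tail call becomes "assign the new argument values, then jump to the top":

```elisp
;;; -*- lexical-binding: t -*-

(defun my-sum-to (n acc)
  "Return ACC plus the sum of the integers 1..N, tail-recursively."
  (if (= n 0)
      acc
    ;; Tail call: nothing remains to be done after it returns.
    (my-sum-to (1- n) (+ acc n))))

(defun my-sum-to-loop (n acc)
  "Same computation as `my-sum-to' with the tail call turned into a jump."
  (while (/= n 0)
    ;; "Goto with arguments": rebind both parameters (ACC first, since it
    ;; needs the old value of N), then loop back to the top.
    (setq acc (+ acc n)
          n (1- n)))
  acc)
```

Both return 5050 for `(my-sum-to 100 0)` and `(my-sum-to-loop 100 0)`. The loop version also handles an argument such as 100000, where the recursive version would be expected to run into `max-lisp-eval-depth` under the current implementation.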

These changes would be one way to address the items in the TODO file for
28.1, starting at line 173:

* Important features
** Speed up Elisp execution [...]
*** Speed up function calls [..]
** Add an "indirect goto" byte-code [...]
*** Compile efficiently local recursive functions [...]

As for the other elements - introducing additional registers to facilitate
efficient lexical closures and namespaces - it still doesn't look like a huge lift
to introduce them into the bytecode interpreter, although there is still the work
to make effective use of them in the output of the compilers.

I have been thinking that some additional reader syntax for what might be
called "meta-evaluation quasiquotation" (better name welcome) could be useful.
I haven't worked out the details yet, though. I would make #, and #,@ effectively
be shorthand for eval-when-compile. Using #` inside eval-when-compile should
produce an expression that, after compilation, would provide the meta-quoted
expression with the semantics it would have outside an eval-when-compile
form.

> Does this mean the native compiled code can only produce closures in
> byte-code form?

Not directly, no. But currently that's the case, yes.

> below with shared structure (the '(5)), but I don't see anything in
> the printed text to indicate it if read back in.

You need to print with `print-circle` bound to t, like the compiler does
when writing to a `.elc` file.
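A short sketch of the effect, using a hypothetical variable `my-shared-demo` whose two elements are the same cons cell. With `print-circle` bound to t, the printer labels the shared object and the reader restores the sharing:

```elisp
(defvar my-shared-demo
  (let ((cell (list 5)))
    ;; Both elements of the outer list are the *same* cons cell.
    (list cell cell))
  "List whose two elements share structure.")

(defun my-print-plain ()
  "Print `my-shared-demo' without marking shared structure."
  (let ((print-circle nil))
    (prin1-to-string my-shared-demo)))   ; "((5) (5))"

(defun my-print-shared ()
  "Print `my-shared-demo' with shared structure labeled."
  (let ((print-circle t))
    (prin1-to-string my-shared-demo)))   ; "(#1=(5) #1#)"

(defun my-read-back-shared-p ()
  "Return non-nil if reading the labeled text restores the sharing."
  (let ((copy (car (read-from-string (my-print-shared)))))
    (eq (car copy) (cadr copy))))
```

The plain printout reads back as two separate `(5)` lists that are `equal` but not `eq`, which is why nothing in that text indicates the sharing.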

I feel silly again. I've *used* Emacs for years, but have (mostly) avoided using
Emacs Lisp for programming because of the default dynamic scoping and the
implications that has for the efficiency of lexical closures.

> I'm sure you're correct in terms of the current code base. But isn't
> the history of these kinds of improvements in compilers for functional
> languages that coding styles that had been avoided in the past can be
> adopted and produce faster code than the original?

Right, but it's usually a slow co-evolution.

I don't think I've suggested anything else. I don't think my proposed changes to the byte-code
VM would change the semantics of Emacs Lisp, just the semantics of the byte-code
VM, which, as you've already stated, does not dictate the semantics of Emacs Lisp.

> In this case, it would be enabling the pervasive use of recursion and
> less reliance on side-effects.

Not everyone would agree that "pervasive use of recursion" is an improvement.

True, but it's still a lisp - no one is required to write code in any particular style. It would
be peculiar (these days, anyway) to expect a lisp compiler to optimize imperative-style code
more effectively than code employing recursion.

> Improvements in the gc wouldn't hurt, either.

Actually, nowadays lots of benchmarks are already bumping into the GC as
the main bottleneck.

I'm not familiar with Emacs's profiling facilities. Is it possible to tell how much of the
allocated space and of the time spent in gc is due to the constant vectors of lexical closures?
In particular, how much of the constant vectors consists of copied elements that are independent
of the lexical environment? That would provide some measure of any gc-related benefit that
*might* be gained from using an explicit environment register for closures, instead of
embedding it in the byte-code vector.
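As a rough starting point only, `benchmark-run` reports the elapsed time, the number of garbage collections, and the time spent in them for a form. The sketch below applies it to a loop that creates many closures. The helper names are hypothetical, and the result is an aggregate: it cannot by itself attribute allocation to closure constant vectors, which is the question actually being asked.

```elisp
;;; -*- lexical-binding: t -*-
(require 'benchmark)

(defun my-make-adder (n)
  "Return a closure that adds N to its argument."
  (lambda (x) (+ x n)))

(defun my-make-adders (count)
  "Return a list of COUNT closures, each capturing a different number."
  (let (result)
    (dotimes (i count)
      (push (my-make-adder i) result))
    result))

(defun my-gc-report (count)
  "Create COUNT closures once and report timing and GC activity.
Return a plist with :elapsed (seconds), :gcs (number of collections)
and :gc-time (seconds spent collecting)."
  (let ((stats (benchmark-run 1 (my-make-adders count))))
    (list :elapsed (nth 0 stats)
          :gcs     (nth 1 stats)
          :gc-time (nth 2 stats))))
```

Calling `(my-gc-report 100000)` in interpreted and in byte-compiled form gives a crude comparison of the two closure representations; a finer breakdown would need support from the allocator itself.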

Lynn

From:	Lynn Winebarger
Subject:	Re: native compilation units
Date:	Sun, 19 Jun 2022 13:52:35 -0400