Re: native compilation units

From: Lynn Winebarger
Subject: Re: native compilation units
Date: Sun, 19 Jun 2022 13:52:35 -0400

On Wed, Jun 15, 2022 at 8:23 AM Stefan Monnier <monnier@iro.umontreal.ca> wrote:
> There is one kind of expression where Andrea isn't quite correct, and that
> is with respect to (eval-when-compile ...).

You don't need `eval-when-compile`.  It's already "not quite correct"
for lambda expressions.  What he meant is that the function associated
with a symbol can be changed at any moment.  But if you call
a function without going through such a globally-mutable indirection the
problem vanishes.
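
To make the distinction concrete, here is a minimal sketch, assuming `lexical-binding` is enabled. All `my-demo-` names are hypothetical and exist only for illustration. It contrasts a call that goes through a symbol's function cell with a call that goes through a lexically captured closure.

```elisp
;;; -*- lexical-binding: t; -*-

;; Hypothetical helper, reachable only through its symbol's function cell.
(defun my-demo-helper (x)
  (* 2 x))

(defun my-demo-via-symbol (x)
  ;; The function bound to `my-demo-helper' is looked up at call time.
  (my-demo-helper x))

(defalias 'my-demo-via-closure
  (let ((helper (lambda (x) (* 2 x))))
    ;; HELPER is a lexical variable that no outside code can rebind.
    (lambda (x) (funcall helper x))))

(defvar my-demo-before (my-demo-via-symbol 3))

;; Redefine the helper globally, as any code may do at any moment.
(fset 'my-demo-helper (lambda (x) (* 10 x)))

(defvar my-demo-after (my-demo-via-symbol 3))
(defvar my-demo-closure-result (my-demo-via-closure 3))
```

After the redefinition, `my-demo-via-symbol` follows the new binding while `my-demo-via-closure` still uses the function it captured.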

I'm not sure what the point here is.  If all programs were written with every variable
and function name lexically bound, then there wouldn't be an issue.

After Andrea's response to my original question, I was curious whether the kind of
semantic object that an ELF shared-object file *is* can be captured (directly) in the
semantic model of Emacs Lisp, including the fact that some symbols in ELF are bound
to truly immutable constants at runtime by the loader.  Also, if someone were to
rewrite some of the primitives now in C in Lisp and rely on the compiler for their use,
would there be a way to write them with the same semantics they have now (not
referencing the run-time bindings of other primitives)?

Based on what I've observed in this thread, I think the answer is either yes or almost yes.
The one sticking point is that there is no construct for retaining the compile-time environment.
If I "link" files by concatenating the source together, it's not an issue, but I can't replicate
that with the results the byte-compiler currently produces.

What would also be useful is some analogue to Common Lisp's package construct, but extended
so that symbols could be imported from compile-time environments as immutable bindings.
Now, that would be a change in the current semantics of symbols, unquestionably, but
not one that would break the existing code base.  It would only come into play when compiling
a file as a library, with semantics along the lines of:
  (namespace <name of library obstack>)
  <library code> ...
  (export <symbol> ...)
Currently, compiling a top-level expression wrapped in eval-when-compile by itself leaves
no residue in the compiled output, but I would want to make the above evaluate
to an object at run-time where the exported symbols in the obstack are immutable.
Since no existing code uses the above constructs - because they are not currently defined -
it would only be an extension.
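
For comparison, here is a rough approximation that is available today, assuming `lexical-binding` and `cl-lib`. It is not the `namespace`/`export` construct proposed above, and the `my-lib-` name is hypothetical. Internal functions are bound with `cl-labels`, so references among them cannot be redirected by redefining a global symbol. Only the exported entry point lands in a function cell, and that cell stays mutable, which is the gap the proposal is meant to close.

```elisp
;;; -*- lexical-binding: t; -*-
(require 'cl-lib)

(cl-labels ((square (x) (* x x))
            (sum-of-squares (a b) (+ (square a) (square b))))
  ;; "Export" one entry point under a hypothetical global name.
  (defalias 'my-lib-sum-of-squares #'sum-of-squares))

;; A later global definition named `square' does not affect the library,
;; because the library's own `square' was resolved lexically.
(defun square (_x) 0)

(defvar my-lib-demo-result (my-lib-sum-of-squares 3 4))
```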

I don't want to restart the namespace debates - I'm not suggesting anything to do
with the reader parsing namespaces out of prefixes in symbol names.

>> It's also "modulo enough work on the compiler (and potentially some
>> primitive functions) to make the code fast".
> Absolutely, it just doesn't look to me like a very big lift compared to,
> say, what Andrea did.

It very much depends on the specifics, but it's definitely not obviously true.
ELisp, like Python, has grown around a "slow language" so its code is
structured in such a way that most of the time the majority of the code
that's executed is actually not ELisp but C, over which the native
compiler has no impact.

That's why I said "look[s] to me", and inquired here before proceeding.
Having looked more closely, it appears the most obvious safe approach,
that doesn't require any ability to manipulate the C call stack, is to introduce
another manually managed call stack as is done for the specpdl stack, but
malloced (I haven't reviewed that implementation closely enough to tell if it
is stack or heap allocated).  That does complicate matters.
That part would be for allowing calls to (and returns from) arbitrary points in
byte-code (or native-code) instruction arrays.  This would in turn enable
implementing proper tail recursion as "goto with arguments".
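
As an illustration of "goto with arguments", the sketch below performs by hand the rewrite that proper tail calls would make automatic. The function names are hypothetical, and nothing here touches the byte-code VM. It only shows that a self tail call is equivalent to assigning new values to the parameters and jumping back to the top of the body.

```elisp
;; Tail-recursive form: under the current implementation each call
;; nests one level deeper, so the depth is bounded by
;; `max-lisp-eval-depth'.
(defun my-demo-sum-to (n acc)
  (if (= n 0)
      acc
    (my-demo-sum-to (1- n) (+ acc n))))

;; The same computation with the tail call replaced by "assign the new
;; argument values, then jump to the start", written here as a loop.
(defun my-demo-sum-to-loop (n acc)
  (while (/= n 0)
    ;; ACC is updated first, while N still holds its old value.
    (setq acc (+ acc n)
          n (1- n)))
  acc)
```

Both functions agree on small inputs such as `(my-demo-sum-to 10 0)`, but only the loop form can be called with an `n` far larger than the evaluation depth limit.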

These changes would be one way to address the items in the TODO file for 
28.1, starting at line 173:
* Important features
** Speed up Elisp execution [...]
*** Speed up function calls [..]
** Add an "indirect goto" byte-code [...]
*** Compile efficiently local recursive functions [...]

As for the other elements - introducing additional registers to facilitate
efficient lexical closures and namespaces - it still doesn't look like a huge lift
to introduce them into the bytecode interpreter, although there is still the work
to make effective use of them in the output of the compilers.

I have been thinking that some additional reader syntax for what might be
called "meta-evaluation quasiquotation" (better name welcome) could be useful.
I haven't worked out the details yet, though. I would make #, and #,@ effectively
be shorthand for eval-when-compile.  Using #` inside eval-when-compile should
produce an expression that, after compilation, would provide the meta-quoted
expression with the semantics it would have outside an eval-when-compile.
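
For reference, this is how the existing `eval-when-compile` behaves when its value is actually used, which is what the `#,` shorthand would abbreviate. The variable name is hypothetical. When the file is byte-compiled, the body runs in the compiler and only the resulting constant appears in the output; when the file is loaded as source, the body is simply evaluated in place.

```elisp
(defconst my-demo-squares
  (eval-when-compile
    (let (acc)
      (dotimes (i 5)
        (push (* i i) acc))
      (nreverse acc)))
  "Squares of 0 through 4, computed at compile time when compiled.")
```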

> Does this mean the native compiled code can only produce closures in
> byte-code form?

Not directly, no.  But currently that's the case, yes.

> below with shared structure (the '(5)), but I don't see anything in
> the printed text to indicate it if read back in.

You need to print with `print-circle` bound to t, like the compiler does
when writing to a `.elc` file.
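
A small sketch of the difference, using hypothetical `my-demo-` variables: the same list is printed with `print-circle` off and on, and the second form is read back to check that the sharing survives.

```elisp
;; One cons cell referenced twice from the same list.
(defvar my-demo-shared (list 5))
(defvar my-demo-object (list my-demo-shared my-demo-shared))

(defvar my-demo-plain
  (let ((print-circle nil))
    (prin1-to-string my-demo-object)))   ; "((5) (5))"

(defvar my-demo-circle
  (let ((print-circle t))
    (prin1-to-string my-demo-object)))   ; "(#1=(5) #1#)"

;; Reading the labelled form back yields two references to one object.
(defvar my-demo-reread (car (read-from-string my-demo-circle)))
```

With `print-circle` nil the two elements print identically and come back as separate objects; with it bound to t the `#1=` / `#1#` labels let the reader rebuild the shared structure.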

I feel silly again. I've *used* emacs for years, but have (mostly) avoided using 
emacs lisp for programming because of the default dynamic scoping and the 
implications that has for the efficiency of lexical closures.  

> I'm sure you're correct in terms of the current code base.  But isn't
> the history of these kinds of improvements in compilers for functional
> languages that coding styles that had been avoided in the past can be
> adopted and produce faster code than the original?

Right, but it's usually a slow co-evolution.

I don't think I've suggested anything else.  I don't think my proposed changes
to the byte-code VM would change the semantics of Emacs Lisp, just the semantics
of the byte-code VM, which you've already stated do not dictate the semantics of
Emacs Lisp.

> In this case, it would be enabling the pervasive use of recursion and
> less reliance on side-effects.

Not everyone would agree that "pervasive use of recursion" is an improvement.

True, but it's still a lisp - no one is required to write code in any particular style.
It would be peculiar (these days, anyway) to expect a lisp compiler to optimize
imperative-style code more effectively than code employing recursion.

> Improvements in the gc wouldn't hurt, either.

Actually, nowadays lots of benchmarks are already bumping into the GC as
the main bottleneck.

I'm not familiar with emacs's profiling facilities.  Is it possible to tell how much of the 
allocated space/time spent in gc is due to the constant vectors of lexical closures?  In particular,
how much of the constant vectors are copied elements independent of the lexical environment?
That would provide some measure of any gc-related benefit that *might* be gained from using an
explicit environment register for closures, instead of embedding it in the byte-code vector.
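
One crude starting point, offered only as a sketch: the built-in `memory-use-counts` reports cumulative allocation counters, and its third element counts vector cells. The helper names below are hypothetical. The measurement covers all vector cells allocated while the thunk runs, not closures specifically, and it cannot separate copied constants from captured environment. It also assumes the code is byte-compiled with `lexical-binding` enabled, since interpreted closures may be represented differently depending on the Emacs version.

```elisp
;;; -*- lexical-binding: t; -*-

(defun my-demo-vector-cells-allocated (thunk)
  "Call THUNK and return the number of vector cells allocated meanwhile."
  (let ((before (nth 2 (memory-use-counts))))
    (funcall thunk)
    (- (nth 2 (memory-use-counts)) before)))

(defun my-demo-make-closures (count)
  "Return a list of COUNT closures, each created inside the loop body."
  (let (acc)
    (dotimes (i count)
      (push (lambda (x) (+ x i)) acc))
    acc))
```

Comparing `(my-demo-vector-cells-allocated (lambda () (my-demo-make-closures 1000)))` against the size of one closure's constant vector would give a rough upper bound on what an explicit environment register might save.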

