
Re: On elisp running native

From: Andrea Corallo
Subject: Re: On elisp running native
Date: Thu, 28 Nov 2019 22:52:47 +0000
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/26.2 (berkeley-unix)

Stefan Monnier <address@hidden> writes:

>>> [ I wasn't able to follow all the explanations at
>>>   http://akrl.sdf.org/gccemacs.html, such as the one around "function
>>>   frames", with which I'm not familiar.  Are these like activation
>>>   frames? ]
>> Yes I think we both mean the same.  In this case basically where we store
>> automatic variables and data related to the single activated function.
> OK, thanks.  I think "activation frame" is the standard term (of which
> there can be several on the stack at the same time in case of recursion).

OK, good to know.  I don't have a formal education as a compiler engineer,
so my nomenclature can occasionally be a bit off.

>>> - How did you get there?  I see some "we" in the web page, which makes
>>>   it sound like you weren't completely alone.
>> Sorry for that I'm not much into using 'I'.
> That's OK, but I see you tried to use it as a clever ploy to dodge the
> initial question: how did you get there?

Sorry, I thought the question was more about whether there were other
contributors, for copyright reasons.

On my side, the long story short is this:

I was already into gcc and libgccjit, and I thought it would be cool to
apply them to some lisp implementation.  I decided to target Emacs
because I imagined it would be useful and because I'm obviously an Emacs
user and fan.

I wanted to do something with the potential to be completed and
upstreamed one day, so I discarded the idea of writing a full lisp
front-end from scratch.  On the other hand, I considered the idea, seen
in previous projects, of reusing the byte-compiler infrastructure quite
clever.

The original plan was to do something like Tromey's jitter, but gcc
based and with a mechanism to reload the compiled code.  So I did that.

I had a single-pass compiler, written entirely in C, that decoded the
byte code and drove libgccjit.

I was quite unhappy with that solution for two reasons:

1- The C file was getting huge without doing anything really smart.

2- After some testing and observation, it became clear that this
approach was quite limited for generating efficient code, and that a
more sophisticated one, with a propagation engine and the classical
compiler-theory data structures, was needed.  The idea of just driving
gcc and having everything magically optimized was simply naive.

So I came up with the idea of defining LIMPLE and using it as the
interface between the C and the lisp sides of the compiler.

This way I had the right IR for implementing the 'clever' algorithms in
lisp, while the C side just has to 'replay' the result on libgccjit.
Moreover, it saved me from the pain of exposing libgccjit to lisp.

I then realized that, instead of decoding op-codes, I could just spill
the LAP from the byte-compiler.  This makes the system simpler and more
robust, because I also get information on the stack depth, which I can
double-check or use during limplification.

Lastly, I managed to reuse the information defined in the byte-compiler
about the stack offset of every op to generate, automatically or
semi-automatically, the part of my compiler that translates from LAP to
LIMPLE for a good part of the op-codes.

The rest was just iterating over tests, debugging, and implementing.

I'm not sure this answers the question.  Does it?

>>> - Have you tried to use the compiler as benchmark (i.e. how much faster
>>>   can Emacs compile (either byte-compile or native-compile)) if the
>>>   compiler code is native-compiled (since it's all using
>>>   lexical-binding already)?
>> I use the compiler native compiled but because of the previous point I
>> think is hard to measure the difference.
> How 'bout measuring the time to byte-compile a given set of files, then:
> first using the byte-compiled compiler and then using the
> native-compiled compiler (where "compiler" here means at least cconv.el,
> byte-opt.el, bytecomp.el, and macroexp.el)?

I think it's a test we can do; it sounds like a very good benchmark.

> BTW, I think developing a good set of Elisp benchmarks is useful
> independently from this, so I'd encourage you to submit your benchmarks
> as a new GNU ELPA package (we could also incorporate it into Emacs
> itself, but I think we'll want to use it to compare performance between
> diverse Emacsen, so a separate package makes more sense).

OK, I'll do some clean-up then.  BTW, I think for dhrystone my colleague
has to do the paperwork, and we also have to look into the original

> Maybe someone from the Gnus side will want to submit more benchmarks
> (such as one that manipulates "sets/ranges or article numbers").
>> Talking about compile time in general I think we are looking at
>> something like few minutes to compile the whole Emacs at speed 0.  The
>> time goes up to say ~4 hours with 4 cores for the same job at speed 2.
> [ Compile time varies for me with the normal Emacs from less than
>   5 minutes to more than an hour depending on the machine on which
>   I compile, so absolute times don't speak to me very much.  ]
> So, IIUC, with enough optimisations enabled, we gets into "a long
> time" territory?

Yes, it feels like compiling C++ :x

>> I think it will be interesting to look into the gcc compilation pipe to
>> see where we are losing so much time, my guess is that there's one or
>> few passes that go a little nuts with all the moves we do.  I had no
>> time to look into it but my guess is that once understood the problem we
>> can probably dime it down.
> Indeed, I'm surprised that compilation time in gcc would blow up by
> significantly more than a factor 10 just because of optimisation
> options, so either we're using optimisations which are really too
> costly, or there should be something we can do to avoid this blow up
> without any significant performance loss.

Me too.  My guess is that, because the code we feed into gcc does not
look at all like it was written by a human, something just has to be
tuned somewhere in some pass.  I'm looking forward to investigating it,
but I'm a bit saturated right now.

I'm quite confident we can at least mitigate it, but in general maybe we
should also offer the possibility of specifying different optimization
levels at function granularity.


