
Wrap-up of customize native-compiled dump


From: Lynn Winebarger
Subject: Wrap-up of customize native-compiled dump
Date: Sun, 21 Aug 2022 13:31:42 -0400

A while ago I was asking for assistance in making a "mega-dump" with a
large sample of native-compiled packages preloaded, using the Emacs
28.1 source tarball.  I've had the dump working for a week or so on a
Linux system based on a 3.10 kernel.  I was asked if I could share the
results of my effort, and while I can't share the patches required, I
can describe the results if it makes any difference for current
developers considering the value of this feature.  My "final" build
has a little over 2400 native compilation units in the dump.  The dump
file is ~180MB with the lsp-mode libraries left out, and ~190MB with
lsp-mode included.
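
For context, the mechanism I'm relying on is the stock one: loadup.el
loads an optional site-load.el before writing the dump, so libraries
loaded there end up preloaded.  A rough sketch of the shape of such a
file (the site-lisp location and library names below are placeholders,
not my actual setup):

    ;; site-load.el - loaded by loadup.el before the dump is written.
    ;; Make the bundled third-party libraries visible during the dump.
    (add-to-list 'load-path (expand-file-name "site-lisp" source-directory))
    ;; Placeholder names; the real list has well over 1000 entries.
    (dolist (lib '("example-package-1" "example-package-2"))
      (load lib))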

First, the positives:
*  It's very fast to start up and use.  I have Semantic/EDE turned on
and mostly see no noticeable effect on responsiveness.
Without lsp-mode, startup takes a couple of seconds or less; with
lsp-mode (for most supported languages) it takes a few more seconds.
*  Garbage collection is surprisingly fast for a heap size starting at
~190MB.  I haven't run any benchmarks for this, but I did have one
library that was constantly calling "eval" to determine the tabs for
the header line when Semantic took over the header line.  That
generated a lot of garbage, so it was pausing frequently.  However,
the pauses themselves seemed very short - much shorter than I
associate with an emacs session that has built up a 190+MB heap
from the standard dump.  I can only speculate that having everything
native-compiled means there are relatively few byte-code vectors that
have to be traversed in the heap, and that most of the garbage from
the loading and initialization is effectively collected by the
portable dump process.  Once I changed the problem setting - which was
using "eval" just to construct a call to a thunk - to call the thunk
directly, the pauses became much less frequent and still not very
noticeable (a sketch of that change follows this list).
*  At the end of site-load, I have a call to load the full
customization edit system to force the calculation of customization
groups ahead of time instead of calculating them lazily (a sketch of
this also follows the list).  For a dump with over 1000 non-core
packages, that lazy computation after startup is very slow, and each
group seems to be computed only when it is first opened - I'm assuming
autoload is involved.  Having those groups precalculated in the dump
appears to work well: for the most part, customize buffers open
quickly, and the ordering of the groups and settings is at least
consistent every time instead of depending on the order of library
autoloading.
*  Icons etc. are included in the dump.
*  Whatever the process limit on open shared libraries is, I did not
encounter it.
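
To illustrate the "eval" fix mentioned above: the problem setting was
roughly of this shape (the names here are made up for illustration,
not the actual library's code):

    ;; Placeholder thunk standing in for the library's tab-computing function.
    (defvar my-tab-line-thunk (lambda () "<tabs>"))

    ;; Before: a fresh form was consed and run through `eval' on every
    ;; redisplay, generating garbage each time.
    (setq header-line-format '(:eval (eval (list my-tab-line-thunk))))

    ;; After: call the thunk directly; nothing extra is consed.
    (setq header-line-format '(:eval (funcall my-tab-line-thunk)))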
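
And a sketch of the site-load addition for precomputing the
customization groups, assuming custom-load-symbol is the right entry
point for resolving a group's lazily-loaded members (the exact call I
use may differ slightly):

    ;; Pull the full customization edit machinery into the dump ...
    (require 'cus-edit)
    ;; ... and resolve the custom-load entries for every customization
    ;; group now, so the group tree is already built in the dumped image.
    (mapatoms
     (lambda (sym)
       (when (get sym 'custom-group)
         (custom-load-symbol sym))))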

IOW, the infrastructure for native-compiled files in 28.1 already
provides very efficient inclusion of those files in dumps, and there
appears to be a virtuous-cycle effect with respect to garbage
collection.  I wouldn't mind having my experience verified (or
disproven), though.

The cons - basically the work required to get the system dumped:
*  I submitted a list of issues I had to deal with to be able to
perform any customized dump using site-load.el at
https://debbugs.gnu.org/cgi/bugreport.cgi?bug=57035 .  This list
excludes issues related to including any particular library in that
dump, even if it is in the core.
*  I massaged about 1200 ELPA/MELPA packages so that the library code
was under site-lisp in the source directory, and any data files were
put in <top source directory>/site-data/<package-name>.  Most of
that process was automated, but there were a nontrivial number of
cases that required special attention to ensure that any package
variable set to reference a data directory relative to the file being
loaded or byte-compiled was redirected (a sketch of the typical
pattern follows this list).
*  A lot of packages - including some core libraries - have issues
with variables that expect to be initialized when loaded rather than
at system initialization.  Most of these can be dealt with using
:initialize #'custom-initialize-delay and :set-after '(<vars>); an
example follows this list.  I'm not sure whether the ":set-after"
option alone would have been sufficient when the dependencies had the
delayed-initialization setting, but it usually worked that way.
*  A few variables from the core libraries still caused problems for
customizations initialized based on them.  I haven't reported them as
bugs because custom dumping with native-compiled libraries isn't
really supported as far as I know, but they'll cause issues when it is
(list from memory):
    ** user-emacs-directory (issue in starting the customization
system, I believe)
    ** user-mail-address (regular variable from startup.el, so
perhaps outside the late-initialization framework?)
    ** image-load-path (the issue is from ezimage generating images at
load time rather than initialization, so the late-initialization
framework for customization variables doesn't apply)
*  There may well be some libraries that aren't working and I don't
know it.  One point of the preloading is to make features of these
packages more "discoverable", but it's difficult to tell if there are
conflicting libraries in the system a priori.
*  There's no great way of updating pre-loaded packages one at a time.
I did arrange for all the pre-compiled packages to be in the
"preloaded" subdirectory, so putting newer versions in the main
native-lisp system cache might be a viable way to install updates.
*  The dump process is *slow* (no benefit from having many cores), and
getting errors toward the end of a list of 1000-2000 libraries to
load means a long wait - like 20+ minutes on the system I did this on.
*  Normal process loading is fast even with 2400+ eln files, but
loading that dump under gdb will be a long wait.  I don't know how
long, as I gave up after letting it run for an hour or two and built
a dump with just the autoload files and the problem modules.
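
To make the data-directory redirection mentioned above concrete, the
typical pattern I had to rewrite looked roughly like this (the package
name and variable are hypothetical):

    ;; Common pattern in a package: locate data files next to the source file.
    (defvar foo-data-directory
      (expand-file-name "data"
                        (file-name-directory (or load-file-name
                                                 buffer-file-name))))

    ;; Redirected for the dump, so the path no longer depends on where the
    ;; library happened to be loaded or byte-compiled from:
    (defvar foo-data-directory
      (expand-file-name "site-data/foo" source-directory))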
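
And the defcustom adjustment described a couple of items above, shown
on a hypothetical option (the group, variable, and dependency here are
placeholders):

    (defgroup foo nil "Placeholder group for the example." :group 'applications)

    ;; Delay initialization until normal startup, after the variables the
    ;; default value depends on have been set, instead of computing it at
    ;; load (i.e. dump) time.
    (defcustom foo-cache-directory
      (expand-file-name "foo-cache" user-emacs-directory)
      "Directory where foo stores its cached data."
      :type 'directory
      :group 'foo
      :initialize #'custom-initialize-delay
      :set-after '(user-emacs-directory))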

I find the result worth my effort in putting in "the last mile" of
infrastructure.

Lynn


