emacs-devel

Re: Suppressing native compilation (short and long term)


From: Liliana Marie Prikler
Subject: Re: Suppressing native compilation (short and long term)
Date: Fri, 14 Oct 2022 21:46:09 +0200
User-agent: Evolution 3.46.0

On Thursday, 13.10.2022 at 23:22 +0300, Eli Zaretskii wrote:
> > From: Liliana Marie Prikler <liliana.prikler@gmail.com>
> > Cc: rlb@defaultvalue.org, emacs-devel@gnu.org
> > Date: Thu, 13 Oct 2022 21:20:13 +0200
> > 
> > > When you encounter bugs in native compilation, please report them
> > > to us, so we could fix them.  As of now, we are not aware of any
> > > such bugs that were reported and haven't been fixed.  So if you
> > > still have such problems, please report them ASAP.
> > Of course, that's the intention, but this fix will only make it
> > into the next Emacs release.  Thus, if you're between releases, you
> > still need a workaround.
> 
> If the fix is urgent, why can't you patch the sources when you
> prepare your distribution?
Guix prides itself on being a package manager that can work around many
failures (even while the proper fix for a bug is still being discussed
on mailing lists).  The fact that the solution to this issue is
"compile 28.1 without native-comp" or "use Emacs 27" does not reflect
that particularly well.

> > A particular candidate known to cause issues with the currently
> > packaged 28.1 is [1].
> 
> Where's the description of the actual problem with natively compiling
> that package?  And would you please submit a bug report with the
> details, if you know them?
I am not personally affected, so I can't.  I could direct people to the
Emacs mailing lists, but it seems people in other threads have already
started debugging.  Do you still wish me to do so? 

> > > Why isn't it sufficient to use no-native-compile?  It just means
> > > that on some architectures the corresponding file will be loaded
> > > as byte-compiled, and thus will be slightly slower (how much
> > > slower depends on the code, so if you are worried, my
> > > recommendation is first to measure the difference -- you might be
> > > surprised).
> > Because it'd require a distro-wide fix to address something that
> > e.g. only happens on some AMD CPUs.
> 
> I'm asking why doing so is a problem?  Did you measure the effect on
> performance and found it to be unacceptable in some cases?
Isn't performance one of the main reasons to use native compilation?
Note that the AMD example is hypothetical: we could very well imagine a
performance-critical Emacs package hitting a native-compilation bug.
(I imagine such bugs are particularly likely for those tracking
unreleased Emacs versions, though thankfully I don't think we've
encountered one so far.)

> > > > In GNU Guix, we default to not compiling packages ahead-of-
> > > > time, instead using a minimal emacs that can only do byte
> > > > compilation for most packages.  Users can however pretty easily
> > > > switch to ahead-of-time compilation by swapping out the emacs
> > > > package (using what Guix calls package transformations).  This
> > > > is also important because apart from the current Emacs we
> > > > typically provide an emacs-next package for upcoming versions,
> > > > as well as some other variants that the user might want to use
> > > > instead.  Again, we assume that users who want to opt in to
> > > > native compilation do so via a transformation.
> > > 
> > > Sorry, I'm unfamiliar with this terminology.  When you say
> > > "ahead-of-time compilation", do you mean native compilation of
> > > all the Lisp files before they are loaded for the first time?  Or
> > > do you mean something else?
> > Exactly, it's more or less the same as ahead-of-time compilation
> > via package.el, which Emacs supports out of the box.
> > 
> > > And what is "swapping out the emacs package", and what is
> > > "package transformation"?
> > Guix users can decide between an Emacs package that only does
> > byte-compilation (emacs-minimal, the default) or native compilation
> > (emacs).
> > They can easily do this by using the command-line switch
> > --with-input=emacs-minimal=emacs while building their Emacs package
> > (e.g. emacs-dash for dash.el).
> 
> OK, so why is this relevant to the issue of disabling?  Those who
> choose ahead-of-time compilation will never see async JIT
> compilation, and those who selected not to do ahead-of-time will
> naturally see JIT compilation, as they've chosen.  What is the
> problem here?
The problem is that I can't meaningfully choose the "I don't want JIT
for stuff I haven't AOT'd" option, especially not combined with "but I
do want to load what I have AOT'd".

> > > > In this context, the default of compiling everything that has
> > > > hitherto only been byte-compiled is an ill-advised choice. 
> > > > Now, there is a chance that the user meant to do this anyway,
> > > > but I think they rather a) use whatever the distro default is
> > > > without caring either way or b) actually didn't want this
> > > > package natively compiled.  Due to some packages breaking, we
> > > > had a lot of b) in our mailing lists in the past few days.
> > > 
> > > If a package is a single file or a small number of files, those
> > > users can add the no-native-compile cookies in those files.
> > This is not trivial in the case where the Elisp code is placed in
> > system-managed storage and thus requires elevated privileges to
> > modify (as is the default in most package managers, I assume).  Of
> > course, you can copy the file to your $HOME, but editing it with a
> > broken Emacs is rather painful.
> 
> Using broken packages is always painful, and native compilation
> doesn't change that.
Using broken packages normally doesn't result in the OOM killer firing
off.
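
For reference, the cookie mentioned above is a file-local variable set
in the first line of the Lisp file (or in its Local Variables block).
A minimal sketch, where the file name and the function are made up for
illustration:

```elisp
;;; fragile-package.el --- hypothetical example  -*- lexical-binding: t; no-native-compile: t -*-

;; With the cookie above, a native-comp Emacs skips native compilation
;; of this one file; it can still be byte-compiled and loaded as .elc.
(defun fragile-package-hello ()
  "Hypothetical function, present only to give the file some content."
  (message "hello"))

(provide 'fragile-package)
;;; fragile-package.el ends here
```

Adding that one line is exactly the step that needs write access to
the installed file.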

> Packages provided by a distribution and installed into directories
> where users cannot easily write should be well tested by the
> distributor.  
I think you're underestimating the number of breakages that can happen
in a rolling release model.  Not every distro is as stable as Debian,
but the joke's still on you because despite Debian's hard requirements,
they still ended up encountering this bug.

> IOW, I don't understand why the upstream project should be held
> responsible for packages distributed by downstream distributions
> without testing them well enough to let users use them without pain,
> and why should the upstream project introduce options to support such
> "broken" distros.  It isn't fair to expect us to solve such problems,
> because they should be solved elsewhere.
There are limits to testing.  When I added native comp to Guix, I gave
folks in the mailing lists a heads up to try their own configuration
and report bugs.  But people don't always read the mailing lists and
therefore aren't aware of upcoming changes that may break their system,
and I personally can't test every Emacs configuration in existence.

> > > And again, disabling native compilation is a method that doesn't
> > > allow them fine-tuning anyway, so I fail to see how it could help
> > > them here.  If the problem is real (and I don't yet see it is),
> > > we should perhaps discuss its details and invent some new method
> > > of disabling compilation with finer granularity.
> > The granularity here is disabling compilation of anything that
> > isn't already compiled – this makes it so that people can still use
> > their Emacs for byte compilation by invoking it with "emacs -Q",
> > they just won't compile anything that their package manager hasn't
> > compiled for them.
> 
> That's quite a blunt weapon.  Why not tell them to stay with Emacs
> 27, until the problems are solved, or install Emacs 28 without native
> compilation?
That's what they're currently resorting to.  Guix already supports this
use case.

> They are similarly drastic solutions, but they are already available
> and will definitely work.
Yet they are suboptimal, particularly on traditional distributions
that don't support this use case well.

> > > I don't think you can set native-comp-eln-load-path to nil.  The
> > > last entry, the one which points to where the preloaded *.eln
> > > files live, must be there, or else Emacs will refuse to start. 
> > > And at least one other directory should be there as well, because
> > > if and when some package advises some primitive, Emacs will need
> > > to generate and compile a trampoline .eln file.  But yes, if
> > > users want to prevent loading from a certain directory, they can
> > > remove it from native-comp-eln-load-path, provided that the two
> > > necessary entries are still left in the list.
> > I already found this annoying while implementing native compilation
> > as part of our emacs-build-system (i.e. the build system used to
> > compile Emacs packages), and I find it extra questionable that
> > users on traditional distros, where they don't usually get to
> > choose their Emacs, have no means of disabling this loading.
> 
> You mean, you find the loading of preloaded *.eln files at startup
> annoying?  Then you should know that this is the best solution we
> found for dumping Emacs with natively-compiled preloaded code.
No, I find it annoying that Emacs assumes it always has a writable
eln-cache.  This is not the case in typical package manager scenarios
and it also isn't the case when users choose to make (parts of) their
$HOME read-only, which is a supported configuration in Nix and Guix.  I
can't think of a good reason why one would want to assume this
invariant.
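
To see which situation a given setup is in, one can check the entries
of native-comp-eln-load-path for writability.  A rough sketch, assuming
a native-comp build of Emacs 28; the helper name is made up and not
part of Emacs:

```elisp
(defun my/eln-load-path-writability ()
  "Return an alist of (DIR . WRITABLE-P) for `native-comp-eln-load-path'.
Hypothetical helper; returns nil when the variable is not bound,
e.g. in a build without native compilation."
  (when (boundp 'native-comp-eln-load-path)
    (mapcar (lambda (dir)
              ;; `file-writable-p' also returns t for a directory that
              ;; does not exist yet but could be created.
              (cons dir (file-writable-p dir)))
            native-comp-eln-load-path)))

;; Evaluate to inspect the current session.
(my/eln-load-path-writability)
```

On a store-managed installation with a read-only $HOME, I would expect
every entry to come back as not writable.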

> If you know of a better solution that doesn't suffer from any fatal
> issues we found with the alternatives, please suggest such solutions,
> and we will definitely consider them.
I haven't read the discussions around the alternatives, but couldn't
you just generate one trampoline per function which you use as soon as
it's advised?

Also, how come advice isn't breaking byte-compilation in exactly the
same manner?

> And again, if Emacs with native-compilation is annoying, by all
> means offer your users an Emacs built without native-compilation.
> This is supported and will continue to be supported for the
> foreseeable future.
> 
> > Calling back to the earlier point on measuring performance, an easy
> > way to measure performance between bytecode and native code (in a
> > benchmark) would be to simply disable native code loading in one
> > process.  But here, it requires two separate builds of Emacs.
> 
> As I said earlier, disabling loading of native code made no sense to
> us while Emacs 28 was in development; it still doesn't.  Either one
> wants native-compilation, or one doesn't.  Making Emacs code more
> complicated and harder to maintain due to features that make no sense
> to us is a non-starter.  I see no problem with having to use a
> separate build, since building a release tarball takes a minute or so
> on a modern system.  And distros should definitely have a build
> without native-compilation on offer, for a variety of valid reasons.
I don't think that asking distros to package every Emacs variant twice
is a great idea.  At Guix, we prefer to offer the most complete version
of a package, so we ship with native compilation enabled.  If a user
wants a UI, but no native compilation (i.e. neither emacs nor
emacs-minimal), they have to roll their own package description.

> > While bytecode performance on such machines might too be slow (but
> > perhaps tolerable for the task), ahead-of-time compilation, perhaps
> > with offloading, is preferable.
> 
> I recommend against this, because it is impossible to rely on AOT
> installations to never compile at run time.  Users cannot rely on
> that, and should be advised accordingly.
But why can't they?

> > For another, it can cause bugs like [2].
> 
> That bug by itself (the cause of massive launching of async
> subprocesses) was never explored or described in that thread?  It
> seems like the discussion switched to looking for ways of disabling
> native-compilation right away, without a good understanding of what
> was happening.  Or did I miss something?  Async compilation by
> default never launches more subprocesses than half the execution
> units of the CPU, so what is described there should be carefully
> investigated and the findings described.
It'd be weird if someone found a counterexample to the above statement.

> The other problem in that discussion, with warnings during async JIT
> compilation, is well-known and was reported several times, and the
> culprit is always in the 3rd-party packages being compiled, which
> should be fixed.  In any case, those are just warnings in almost all
> cases, so their only adverse effect is annoyance (that can be
> suppressed by clicking the button in the message).
I saw no such problem in that discussion.  Are we reading the same
thread?

> Again, I see no reason to blame the upstream project for these
> issues.  They should be solved by the offending 3rd-party packages,
> and the distro should ideally uncover and fix them before they get to
> users (I presume that you build and compile the add-on packages you
> offer?).
I'd like to tap the "rolling release distro is not Debian" sign, but
again, even stable distros like Debian are experiencing issues with
native compilation.

> > > Thanks for the explanations.  I still think the reasons for
> > > disabling native compilation are rather weak at best, and the
> > > users' requests to allow that are based on either bugs that need
> > > to be solved or surprise and fears that have no real basis.
> > > Moreover, disabling native compilation is a very blunt instrument
> > > that cannot be applied better than the existing ones, like
> > > no-native-compile (and a few others that we didn't mention; see
> > > the defcustoms in comp.el).
> > Which defcustom?
> 
> Begin with those described in the ELisp manual, in the
> "Native-Compilation Variables" node.  And my recommendation is to
> review _all_ of the defcustoms in comp.el.
The only one I found is setting native-comp-speed to -1.  Is that the
solution?  It doesn't appear to be.
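
For anyone who wants to go through them the same way, here is a sketch
that lists the options in comp.el's customization group.  It assumes
the group is called `comp', as in Emacs 28; later versions may organize
these options differently.

```elisp
(when (and (fboundp 'native-comp-available-p)
           (native-comp-available-p))
  (require 'comp)
  ;; The `custom-group' property holds (SYMBOL WIDGET) entries.
  (dolist (entry (get 'comp 'custom-group))
    (let ((sym (car entry)))
      ;; Only report user options, and only those that are bound.
      (when (and (eq (cadr entry) 'custom-variable)
                 (boundp sym))
        (message "%s = %S" sym (symbol-value sym))))))
```

The output goes to *Messages*; in a build without native compilation
the form does nothing.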

> > I fear that for all of its customizability, Emacs is
> > currently lacking a good way of disabling native compilation
> > outside of package management.
> 
> Yes, because, as mentioned, this makes no sense.
I think you'll find it does.

> And disabling native-compilation completely is currently technically
> impossible (for the same reason: Emacs wasn't designed to support
> that because it didn't and still doesn't seem needed).
> 
> > I also tried setting no-native-compile globally, but it seems to
> > only have an effect as a file-local variable.
> 
> Yes, as designed.  This variable is the equivalent of
> no-byte-compile, and works very similarly.
> 
> To summarize: native compilation in a build which supports it is
> ubiquitous, and is not designed to be disabled except by
> no-native-compile on a file by file level.  If a more general
> disabling is needed for some reason, users should simply use a build
> without native-compilation.  It's the same as various toolkit builds:
> if the toolkit is broken or doesn't fit the user's needs, those users
> should install a build with a different toolkit.
Pardon my French, but that thinking in and of itself is broken.  Native
compilation is not a choice where you pick whichever of several options
most suits your fancy.  It could be one if you let the user choose
between libgccjit, clang and some other compilers that shall not be
named, though I don't recommend implementing that.  As such, I think
users who do want to use native compilation should get some more say in
when, where, and what to compile.

Cheers


