Re: guile and emacs: unexec
Sun, 14 Jun 2009 01:21:37 -0400
On Jun 13, 2009, at 09:06, Andy Wingo wrote:
On Fri 12 Jun 2009 07:02, Ken Raeburn <address@hidden> writes:
I'm glad to see the emacs-lisp work is progressing. As it happens, a
month or so ago I blew some of the dust off my old guile-emacs project
and started working on it again too. This flavor of emacs+guile work
aimed to replace Lisp objects in Emacs with Guile objects at the lowest
level (numbers, cons cells; symbols and such become smobs) and then
work upwards from there.
Very interesting! To be clear -- the goal would be to represent as
much of Emacs using cheap Guile structures as possible: numbers and cons
cells and such, and represent specific Emacs objects as smobs? That's
probably a good idea.
Yes -- for now, that includes anything I haven't converted, including
strings, symbols, vectors of objects, hash tables, etc. Many of what
are currently smobs should eventually be converted to using Guile's
versions, either directly or with some simple wrapping. They need to
stay identifiable in Lisp as the correct object types, so I can't
implement a Lisp string with, say, a Guile list containing a string
plus text-property data. In the long term perhaps some of them could
be implemented partly or fully in Scheme, but I don't want to diverge
radically while I still need to track the main Emacs code base.
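A minimal sketch of the wrapping idea described above, in plain C with no Guile dependency. Every name here (`wrapped_obj`, `lisp_type`, `lisp_string`, `wrap`, `wrapped_stringp`, `wrapped_string_data`) is hypothetical and is not taken from the Emacs or Guile sources; the point is only that an opaque wrapper with a type tag lets existing predicates and accessors keep recognizing the object as the correct Lisp type.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical Lisp-level type tags; not the real Emacs or Guile enums. */
enum lisp_type { LISP_STRING, LISP_SYMBOL, LISP_VECTOR };

/* Stand-in for a smob: an opaque cell holding a type tag and a pointer
   to the unchanged Emacs-side payload. */
struct wrapped_obj {
    enum lisp_type type;
    void *payload;
};

/* Hypothetical Emacs-side string payload: bytes plus text-property data. */
struct lisp_string {
    const char *data;
    size_t size;
    void *text_props;   /* opaque here; only shows that it travels along */
};

/* Wrap a payload so its Lisp type stays recoverable. */
struct wrapped_obj wrap(enum lisp_type type, void *payload)
{
    assert(payload != NULL);
    struct wrapped_obj o = { type, payload };
    return o;
}

/* Type predicate: looks at the tag only, never at the payload. */
int wrapped_stringp(struct wrapped_obj o)
{
    return o.type == LISP_STRING;
}

/* Accessor: returns the string bytes, or NULL if the tag does not match. */
const char *wrapped_string_data(struct wrapped_obj o)
{
    if (!wrapped_stringp(o))
        return NULL;
    return ((struct lisp_string *) o.payload)->data;
}
```

With this shape, a string stays a single tagged object rather than, say, a list containing a string plus property data, so code that asks "is this a string?" gets the same answer as before.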
(Please let me keep the illusion that replacing the fundamental object
representation, allocator and garbage collector, and compensating for
initialization problems throughout the code, isn't all that radical a
change.)
But I'm not worrying about that much right now -- if the
representation abstraction is complete and correct, the existing Emacs
code should be able to pull out all the right data from the smobs, and
results should be indistinguishable. Well, except that integer and
floating ranges may be different, hash table ordering changes --
simple, reasonable, and well-understood differences. It's not quite
I figure, once I've got this set of changes working correctly (i.e.,
nearly indistinguishable, no random unexplained errors or differences
in behavior), then I can tackle the next steps with more confidence
that differences observed there are due to the new changes in
progress, not semantic differences previously introduced with further-
reaching effects than I expected.
It's also kind of appealing to have something at intermediate stages
that I might be able to show off, and say "hey, this works well enough
that you can try it out; want to help me on the next steps?" (And
since I'm getting into all this now, I *would* like some help. I was
just intending to fix a few more problems before making the plea. :-)
I'm specifically *not* trying to do some of the other things that have
been discussed but aren't about running Emacs -- make buffers
independent objects that can be used outside of Emacs, stuff like
that. That can come later (or not), and I'd be glad to see it happen,
but getting Emacs running at all is a big enough project for me on my
own.
Symbols however should probably be represented as Guile symbols, not
smobs. I think that you will find that with a more compilation-centric
approach, we will be able to keep more simple datatypes, as we compile
the procedures that operate on those data types to appropriate code.
Eventually, yes, I think so. They should probably be one of the next
things to change, though some like vectors and strings might be
simpler. I'm also concerned about the performance impact of making
such a switch; another reason for getting something working soon is so
it's practical to look at performance questions.
I've updated to recent Emacs sources and Guile 1.8.6. I've gotten it
to a point where it seems to start up fine in tty mode, reads in (and
does color highlighting of) C files and directories, does some other
basic stuff. I'm tweaking it now to see if I can get more stuff
working (like Cocoa support and "make bootstrap") and do more
Very neat! That's fantastic that you were able to get it this far, I
didn't know that was possible.
I actually had it pretty far along once or twice before (I seem to
keep reviving this every few years, and spend a lot of time updating
to newer code bases), but I think I've managed to push it a bit
further than I had it earlier. With just me working on it, depending
on the demands of my job, there tend to be large periods when no
progress gets made, and it doesn't keep up with the upstream sources;
the prospect of having to do a bunch of catch-up work just makes it
that much less appealing to get back into it. It's been moving
forward in spurts for over a decade now, very slowly. :-(
If this is an effort that you want to pay off in the future, though, I
would strongly suggest updating to the 1.9/2.0 series of Guile. The
expressive range of Guile's multilingual facilities is much higher
there, and significantly different from 1.8.
I was looking at updating, but ran into the -I ordering problem I
reported. Since that's fixed, I'll try again sometime.
The multilingual facilities aren't very important to me right now --
like I said above, I'm mostly just switching some object
representations now, and I'm still using the Emacs code for any
multilingual stuff. Eventually that should change, but what I want of
Guile right now is a nice, simple byte array I can stick string data
into. :-) Emacs 23 is going to go out with the Emacs version of the
support, and yanking out anything made available to Lisp programmers
isn't going to go over very well. Of course, it wouldn't be very good
to wind up with duplicated work, or redundant or conflicting
OTOH, the emacs lisp support is not yet up to the level that it is
in 1.8, so perhaps now is not yet the time.
And, I haven't started using any of that code yet, either... that's
another big change to try at some point when everything else is
looking solid. And, I assume it expects the use of Guile symbols and
Guile strings at least? In order to make this switch, too, the
semantics really have to match Emacs Lisp -- stuff like indirect
symbols, buffer- or frame-local bindings, etc. And all the Emacs C
code needs to know how to look up values (or function values, or
property lists, or whatever) when given Guile symbols. And then
there's the lexical binding branch work, which I haven't even looked
at.
One really big hiccup I've run into, which I've sort of sidestepped
for the moment: Guile is not unexec-friendly.
There is a way to build Emacs so it doesn't use unexec, but it then
has to load a lot of Lisp code at run time, really killing the startup
performance, and I don't think it's tested all that much (e.g., "make
bootstrap" doesn't work even without the Guile hacks). To really
make this project work, I need to be able to link against Guile (static is
fine, and probably necessary), do a bunch of Lisp/Scheme processing,
write out a memory image into a new executable, and later be able to
run that executable.
It's true that Guile doesn't do unexec currently. It might in the
future -- obviously it will if you implement it of course ;)
But I would ask that you reconsider your approach to making Guile-Emacs
load quickly. There is no a priori reason that loading Lisp code should
be slow. With Guile-compiled elisp, loading a file is just mapping it
into memory -- the same as you have with an image. The loaded code needs
to be run to establish definitions, but that is a very quick operation.
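To illustrate what "just mapping it into memory" can look like, here is a minimal sketch using plain POSIX `mmap`. It is not Guile's actual loader; `map_file` and `unmap_file` are hypothetical helper names, and the sketch assumes a POSIX system and a non-empty regular file.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Map a whole file read-only.  Returns NULL on failure (including an
   empty file); on success stores the file length in *len. */
const void *map_file(const char *path, size_t *len)
{
    assert(path != NULL && len != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size == 0) {
        close(fd);
        return NULL;
    }

    void *p = mmap(NULL, (size_t) st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                  /* the mapping remains valid after close */
    if (p == MAP_FAILED)
        return NULL;

    *len = (size_t) st.st_size;
    return p;
}

/* Release a mapping obtained from map_file. */
void unmap_file(const void *p, size_t len)
{
    munmap((void *) p, len);
}
```

No bytes are copied or parsed at map time; pages are faulted in as they are touched. Whatever work remains (running the loaded code to establish definitions) happens after the mapping is in place.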
I don't think the current Lisp reader is all that slow, but it has to
load and run quite a bit of stuff, especially with the
internationalization support. Especially during a "bootstrap"
operation, when most of the stuff it loads is uncompiled Lisp source
It seems to me that switching to Guile-compiled elisp for startup
would require, well, basically most of the remaining work of my
project, including switching to the Guile-based Lisp reader and
evaluator, wouldn't it? So we're looking at some non-trivial changes
here. They're desirable changes, in the long run, but taking this
route would mean no efficient startup of guile-emacs any time soon,
which in turn slows down the development cycle. The unexec support
may be useless once we get there, but right now it's a much shorter
path to something useable I can show off.
(Fixing up the "interactive scheme mode" that talks to Guile directly
would be nice to show off, too. My current one is kind of a lame hack.)
I agree that heap saving could be slightly faster. But I think that
Emacs should be able to load from bytecode within 100 ms or so /with
current Guile-VM code/ -- and even faster if we do native ahead-of-time
compilation at some point.
I'd certainly like to get there eventually.
Really, it comes down to wanting something I can make work now,
instead of a project with minimal, uninteresting intermediate results
that may or may not pay off in another decade or so, and doesn't get
anyone else interested in helping out. With the current state of
Emacs, that means unexec is kind of needed. It can sort of work
without it, but not well -- and that's true of the upstream Emacs code
base too, but no one on that side cares very much because unexec works
for them everywhere.
I've got some political concerns here too. There has also been some
resistance, when this project has come up on the Emacs lists, to
switching away from the current Lisp evaluator for any reason, even if
Guile support is added (it's not broken, major changes involve
significant risk, don't see the benefit, etc); there's also been
support, but it's contentious. So my rather vague plan has involved
putting off even addressing that possible switch until I can show
clear advantages and no blatant drawbacks (like performance, or
correctness, or handling of out-of-memory conditions) to using Guile.
I'd rather not discuss it from a position of weakness and uncertainty;
better to have working code we can experiment with and numbers we can
point to. (But first, let's experiment and generate numbers
ourselves, and see if we need to fix bugs.) Then we can discuss our
I don't know how much chance there would be for getting it ready in
time for Emacs 24, but with enough help, I think Emacs 25 should be
doable; possibly even 24, who knows?
Any record of current threads needs to go away, and be replaced with
info on the new one-and-only thread in the new process; I'm building
without thread support for now to get around it. Any record of stack
regions to be scanned for SCM objects likewise needs resetting.
Allocated objects must *not* go away, and must continue to be managed
by the garbage collector, so I can't just reinitialize everything.
Assigned smob types must remain in effect, and for now I'm ignoring the
possibility that some smobs may need some kind of reinitialization.
Mutexes... well, I don't know if they need reinitializing; POSIX is
kind of unclear on interactions with unexec. :-) I expect
reinitializing them is probably safe, even if not required in some cases.
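The reset-versus-preserve split described above can be sketched roughly as follows. This is a toy model, not libguile internals: `runtime_state`, its fields, and `reinit_after_dump` are all hypothetical names, and a real implementation would have far more state to consider (mutexes included).

```c
#include <assert.h>
#include <stddef.h>

#define MAX_THREADS    8
#define MAX_SMOB_TYPES 16

/* Hypothetical runtime bookkeeping; the fields are illustrative only. */
struct runtime_state {
    int         thread_ids[MAX_THREADS];   /* known threads */
    size_t      thread_count;
    void       *stack_base;                /* stack region scanned by the GC */
    size_t      heap_objects;              /* live allocated objects */
    const char *smob_names[MAX_SMOB_TYPES];/* assigned smob types */
    size_t      smob_count;
};

/* Called once in the restarted (dumped) executable. */
void reinit_after_dump(struct runtime_state *rt,
                       int new_thread_id, void *new_stack_base)
{
    assert(rt != NULL);

    /* Reset: the old process's threads no longer exist; record only the
       one thread of the new process. */
    rt->thread_count = 0;
    rt->thread_ids[rt->thread_count++] = new_thread_id;

    /* Reset: the stack to scan is the new process's stack. */
    rt->stack_base = new_stack_base;

    /* Preserve: heap_objects, smob_names and smob_count are deliberately
       left untouched, since dumped objects and their types must survive. */
}
```

The design point is that this cannot be a blanket reinitialization: only per-process records (threads, stack bounds) are replaced, while everything describing the dumped heap is carried over as-is.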
This could be complicated if we merge in the BDW-GC branch, to use
libgc. Note that SCM does have unexec, IIRC; we could steal parts of
it.
That might work, yes... or if not, it sounds like I'd be stuck with
using an old Guile, or getting the CANNOT_DUMP option working and
suffering with the slow startup.
(And, this reminds me -- there are still some likely GC-related bugs
with scm_leave_guile/scm_enter_guile that should be fixed up. I got
them removed from the API years ago, but they're still used internally
in threads.c, down below the comment with my old email explaining the
doom they may bring upon us. Does BDW-GC scrap that code finally?)
Is this something that could be useful to anyone outside of Emacs?
Unexec certainly could, to deliver self-contained binaries. But TBH I
think the booting-from-compiled-files option is more maintainable. In
any case this would be a neat hack. Have fun! :)
I agree, compiled files would work better, but I doubt we can push the
Emacs folks to move in that direction first. They're happy with
unexec for now.
P.S. If anyone wants to take a look at my current work,
has a snapshot from tonight.
Cool! Have you considered using git, and branching from Emacs' git
mirror? That way it is trivial to set up something other people can
comment on, in easily-digestible patch chunks.
Yep, but I need to get proficient with it first, and haven't put in
the time yet; until then I'm using subversion in a rather clumsy
fashion (often just checkpointing untested merges, and my Emacs
sources have the CVS admin files checked in so I can update easily).
If it's something other people want to actually work on, on the other
hand, we could set up something via sourceforge or savannah or
whatever. But only if there's actually going to be additional help