Preview: portable dumper

From: Daniel Colascione
Subject: Preview: portable dumper
Date: Mon, 28 Nov 2016 11:50:31 -0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.4.0

I've been working on a portable dumper for GNU Emacs. The attached patch is an early version of this work. I'll write up all the usual NEWS entries, changelogs, and (in this case necessary) dedicated documentation before I land it. I want to start getting review comments now that the code has roughly its final shape.

The point of this gargantuan patch is that we can rip out our unexec implementations and replace them with loading a data file that contains an Emacs heap image. There are no dependencies on executable rewriting, disabling ASLR, or saving and restoring internal malloc state. This system works with fully position-independent executables and with any malloc implementation.

Basically, there's a new dump-emacs-portable function that walks the Emacs heap and writes the data, along with the necessary relocations, to a file called emacs.pdmp. On startup, early in main, we find emacs.pdmp and load it. Once we've loaded the file, we walk the list of relocations contained in the dump and adjust the pointers they describe to account for the runtime locations of Emacs and the dump data (neither of which we know in advance in a PIE world).
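To make the relocation pass concrete, here is a minimal sketch in C. It is not the code from the patch: the struct reloc layout, the two relocation kinds, and the apply_relocs name are all assumptions made for illustration. Each relocation names a pointer-sized slot in the loaded image that was written as an offset, and the pass adds the base address that is only known at runtime.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical relocation kinds; the real dump format may differ.  */
enum reloc_kind
{
  RELOC_DUMP_RELATIVE,   /* slot holds an offset into the dump image */
  RELOC_EMACS_RELATIVE   /* slot holds an offset from the Emacs base */
};

/* Hypothetical relocation record.  */
struct reloc
{
  enum reloc_kind kind;
  size_t offset;         /* position of the pointer-sized slot in the dump */
};

/* Turn every offset named by RELOCS into an absolute address, using
   the runtime location of the dump and of the Emacs executable.  */
static void
apply_relocs (unsigned char *dump, size_t dump_size,
              const struct reloc *relocs, size_t nr_relocs,
              uintptr_t emacs_base)
{
  uintptr_t dump_base = (uintptr_t) dump;
  for (size_t i = 0; i < nr_relocs; i++)
    {
      assert (relocs[i].offset + sizeof (uintptr_t) <= dump_size);
      unsigned char *slot = dump + relocs[i].offset;
      uintptr_t value;
      /* memcpy avoids any alignment assumptions about the slot.  */
      memcpy (&value, slot, sizeof value);
      value += (relocs[i].kind == RELOC_DUMP_RELATIVE
                ? dump_base : emacs_base);
      memcpy (slot, &value, sizeof value);
    }
}
```

A caller would map or read the dump, locate the relocation table inside it, and call apply_relocs once before touching any dumped object.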

There are a few subtleties: I've carefully designed the file format to be mmap-able and to minimize the number of on-demand copies the system makes while accessing this file. For example, we stick bool-vectors and string data at the end of the dump in a contiguous block. We follow this block with the relocations, which we can throw away as soon as we've used them.

An additional optimization follows, although this part isn't implemented yet: we can define a "preferred load address" for the dump and write relocation information such that if the dump and Emacs end up being where we expect them to be, we don't have to perform any relocations at all.
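Since this part isn't implemented, the following C sketch only illustrates the idea under stated assumptions: slots are written as absolute addresses computed from preferred bases, and struct rebase_reloc and rebase_dump are hypothetical names. When both the dump and Emacs land at their preferred addresses, the function returns without writing to the image, so a mapped dump would incur no copy-on-write faults from relocation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical relocation record for a dump written against
   preferred base addresses.  */
struct rebase_reloc
{
  size_t offset;        /* position of the pointer-sized slot in the dump */
  bool targets_emacs;   /* true: points into Emacs; false: into the dump */
};

/* Rebase the slots named by RELOCS from the preferred addresses to
   the actual ones.  Returns the number of slots rewritten, which is
   zero when everything landed where the dump expected.  */
static size_t
rebase_dump (unsigned char *dump, size_t dump_size,
             const struct rebase_reloc *relocs, size_t nr_relocs,
             uintptr_t preferred_dump_base,
             uintptr_t preferred_emacs_base,
             uintptr_t actual_emacs_base)
{
  /* Unsigned subtraction wraps, so a negative delta still adds correctly.  */
  uintptr_t dump_delta = (uintptr_t) dump - preferred_dump_base;
  uintptr_t emacs_delta = actual_emacs_base - preferred_emacs_base;

  if (dump_delta == 0 && emacs_delta == 0)
    return 0;           /* fast path: the image is not written to */

  size_t rewritten = 0;
  for (size_t i = 0; i < nr_relocs; i++)
    {
      uintptr_t delta = relocs[i].targets_emacs ? emacs_delta : dump_delta;
      if (delta == 0)
        continue;
      assert (relocs[i].offset + sizeof (uintptr_t) <= dump_size);
      unsigned char *slot = dump + relocs[i].offset;
      uintptr_t value;
      memcpy (&value, slot, sizeof value);
      value += delta;
      memcpy (slot, &value, sizeof value);
      rewritten++;
    }
  return rewritten;
}
```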

The system degrades gracefully, though. If we can't use mmap or an equivalent on a given platform, it's possible to just slurp the whole file into a malloced region of memory and access it from there. This approach would benefit from compression, which would reduce the IO load: LZ4 reduces the dump size for me from ~12MB to ~4MB. As in the mmap case, we can throw away the relocations as soon as we've used them.
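The malloc fallback could look roughly like the following C sketch. It uses only standard stdio, leaves out compression, and the slurp_stream name is an assumption rather than anything from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Read everything remaining in STREAM into a malloc'd buffer.
   On success, store the byte count in *SIZE_OUT and return the
   buffer, which the caller frees.  Return NULL on failure.  */
static unsigned char *
slurp_stream (FILE *stream, size_t *size_out)
{
  assert (stream != NULL && size_out != NULL);
  size_t capacity = 4096;
  size_t size = 0;
  unsigned char *buf = malloc (capacity);
  if (buf == NULL)
    return NULL;

  for (;;)
    {
      size += fread (buf + size, 1, capacity - size, stream);
      if (size < capacity)
        break;          /* short read: end of file or an error */
      if (capacity > SIZE_MAX / 2)
        {
          free (buf);
          return NULL;
        }
      unsigned char *bigger = realloc (buf, capacity * 2);
      if (bigger == NULL)
        {
          free (buf);
          return NULL;
        }
      buf = bigger;
      capacity *= 2;
    }

  if (ferror (stream))
    {
      free (buf);
      return NULL;
    }
  *size_out = size;
  return buf;
}
```

The buffer returned here plays the same role as the mapped region in the mmap case: relocations are applied to it in place before any dumped object is used.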

The code isn't even close to optimized yet (I've only tested it at -O0 with GC_CHECK_MARKED_OBJECTS defined, and I haven't yet inlined the frequently-called functions in pdumper.h), but even so, it's within 100ms or so of an unexeced Emacs.

It's also possible to dump an already-dumped Emacs, so it should be possible for users to have their own dump files.

If we want to preserve the current model of a single "emacs" executable that contains itself, we can embed emacs.pdmp inside the emacs executable data section pretty easily. There's no behavior change involved.

If you want to try this code, build with CANNOT_DUMP=1, run ./temacs -l loadup pdump, then run ./emacs (or, if that doesn't work, ./emacs --dump-file=emacs.pdmp).

Attachment: pdumper.diff
Description: Text Data
