Ken Raeburn <address@hidden> wrote on Mon., May 29, 2017 at 11:33:
Ken Raeburn <address@hidden> wrote on Sun., May 28, 2017 at 13:07:
On May 21, 2017, at 04:53, Paul Eggert <address@hidden> wrote:
> Ken Raeburn wrote:
>> The Guile project has taken this idea pretty far; they’re generating ELF object files with a few special sections for Guile objects, using the standard DWARF sections for debug information, etc. While it has a certain appeal (making C modules and Lisp files look much more similar, maybe being able to link Lisp and C together into one executable image, letting GDB understand some of your data), switching to a machine-specific format would be a pretty drastic change, when we can currently share the files across machines.
> Although it does indeed sound like a big change, I don't see why it would prevent us from sharing the files across machines. Emacs can use standard ELF and DWARF format on any platform if Emacs is doing the loading. And there should be some software-engineering benefit in using the same format that Guile uses.
Sorry for the delay in responding.
The ELF format has header fields indicating the word size, endianness, machine architecture (though there’s a value for “none”), and OS ABI. Some fields vary in size or order depending on whether the 32-bit or 64-bit format is in use. Some other format details (e.g., relocation types, interpretation of certain ranges of values in some fields) are architecture- or OS-dependent; we might not care about many of those details, but relocations are likely needed if we want to play linking games or use DWARF.
I think Guile is using whatever the native word size and architecture are. If we do that for Emacs, the compiled files won’t be portable between platforms. Currently I can put my Lisp files, both source and compiled, into ~/elisp and use them from different kinds of machines when my home directory is NFS-mounted.
We could instead pick fixed values (say, architecture “none”, little-endian, 32-bit), but then there’s no guarantee that we could use any of the usual GNU tools on them without a bunch of work, or that we’d ever be able to use non-GNU tools to treat them as object files. Then again, we couldn’t expect to do the latter portably anyway, since some of the platforms don’t even use ELF.
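For concreteness, the identification bytes at the start of every ELF file pin down exactly those choices. A minimal decoding sketch in Python follows; the field offsets come from the ELF specification, and the sample header encodes the hypothetical fixed choice above (32-bit, little-endian, machine “none”), not anything Emacs or Guile actually emits:

```python
import struct

# Offsets within e_ident, per the ELF specification.
EI_CLASS, EI_DATA, EI_OSABI = 4, 5, 7
CLASSES = {1: "32-bit", 2: "64-bit"}
ENDIAN = {1: "little-endian", 2: "big-endian"}

def describe_elf_ident(header: bytes) -> dict:
    """Decode the architecture-dependent fields of an ELF header."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_machine is a half-word at offset 18, in the file's own byte order.
    byte_order = "<" if header[EI_DATA] == 1 else ">"
    (e_machine,) = struct.unpack_from(byte_order + "H", header, 18)
    return {
        "class": CLASSES[header[EI_CLASS]],
        "data": ENDIAN[header[EI_DATA]],
        "machine": e_machine,  # 0 == EM_NONE, "no machine"
    }

# Hypothetical fixed-format header: e_ident with class 1, data 1,
# version 1, OSABI 0, zero padding; then e_type=1 and e_machine=0.
sample = b"\x7fELF" + bytes([1, 1, 1, 0]) + bytes(8) + struct.pack("<HH", 1, 0)
print(describe_elf_ident(sample))
```

Any tool that wants to treat these as real object files would have to accept that fixed combination, which is the portability/tooling trade-off in question.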
Is there any significant advantage of using ELF, or could this just use one of the standard binary serialization formats (protobuf, flatbuffer, ...)?
That’s an interesting idea. If one of the popular serialization libraries is compatibly licensed, easy to use, and performs well, it may be better than rolling our own.
I've tried this out (with flatbuffers), but I haven't seen significant speed improvements. It might very well be the case that during loading the reader is already fast enough (e.g. for ELC files it doesn't do any decoding), and it's the evaluator that's too slow.
What’s your test case, and how are you measuring the performance?
In my tests with the single big elc file, using the Linux “perf” tool, readchar, read1, encode_char, and ungetc still account for a good chunk of CPU time (about a quarter in my runs). My experiment in May cut a chunk of the overall run time (start in batch mode, print a message, and exit) with some ugly reader syntax hacks. Tests with smaller files may have different characteristics though…
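The read-versus-eval split can be measured in any language by timing parsing separately from executing the already-parsed form. A loosely analogous Python micro-benchmark (not the Emacs test case, just the shape of the measurement) might look like:

```python
import timeit

# Separate "reader" cost (parsing source text into a code object)
# from "evaluator" cost (running the already-parsed form), mirroring
# the readchar/read1 vs. evaluator split perf was attributing time to.
src = "sum(i * i for i in range(100))"

read_time = timeit.timeit(lambda: compile(src, "<t>", "eval"), number=10_000)
code = compile(src, "<t>", "eval")
eval_time = timeit.timeit(lambda: eval(code), number=10_000)

print(f"read: {read_time:.4f}s  eval: {eval_time:.4f}s")
```

If eval dominates even when the read step is made nearly free, a faster container format (ELF, flatbuffers, or anything else) can only buy back that smaller read fraction.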