From: Daniel Colascione
Subject: Re: Time to drop the pre-dump phase in the build?
Date: Fri, 10 Jan 2014 21:30:13 -0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.2.0
On 01/10/2014 09:13 PM, Stefan Monnier wrote:
Another possibility is to just allocate enough space in the emacs image itself in BSS, then replace that mapping with a view of the dump file.
Indeed, that should work, assuming you can mmap into existing space.
On POSIX-y systems, you can just mmap on top of the existing section. On Windows, you have to unmap first, but I think it could be made to work.
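To make the POSIX case concrete, here is a minimal sketch, not Emacs code: the names `reserved`, `RESERVED_SIZE` and `map_dump_over_bss` are made up for illustration, and the alignment attribute assumes GCC or Clang. It reserves a zero-filled region in BSS and then uses mmap with MAP_FIXED to replace the start of that region with a private, copy-on-write view of a file.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical reserved region: zero-initialized, so it lives in BSS.
   Over-aligned so that it starts on a page boundary on common systems. */
#define RESERVED_SIZE (1 << 16)
static char reserved[RESERVED_SIZE] __attribute__((aligned(65536)));

/* Replace the start of the reserved region with a private (copy-on-write)
   view of the first LEN bytes of PATH.  Returns the region's address on
   success, NULL on failure.  Writes through the mapping never reach the
   file, because the mapping is MAP_PRIVATE. */
static void *map_dump_over_bss(const char *path, size_t len)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || len == 0 || len > RESERVED_SIZE)
        return NULL;
    /* MAP_FIXED requires a page-aligned target address. */
    if ((uintptr_t)reserved % (uintptr_t)page != 0)
        return NULL;

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;

    /* MAP_FIXED atomically discards whatever was mapped at this address
       (here: the BSS pages) and installs the file view in its place. */
    void *p = mmap(reserved, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_FIXED, fd, 0);
    close(fd);  /* the mapping stays valid after the descriptor is closed */

    if (p == MAP_FAILED)
        return NULL;
    assert(p == (void *)reserved);
    return p;
}
```

After a successful call, reads of `reserved` see the file's bytes, and the first write to any page gives the process its own copy of that page. The Windows variant (unmap first, then map a view at the same address) is not shown.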
But not nearly as bad: the main dump problem we have is with generating the `emacs' executable, whereas here we'd only need to generate the "swap file" which is later loaded into the same executable. Should still be a lot more portable.
Do you mean building emacs with a large blob of zero in .data, using it as a heap, and replacing the contents of that section (without modifying the executable image structure) to actually "dump" emacs?
By the way: is it me, or are we dirtying far too much of the current emacs image? On my Emacs, we're dirtying (and COWing) 8MB; if I make Fgarbage_collect a no-op, that drops to 4MB.
For sure, GC will dirty up pretty much all pages that hold Lisp objects (except for those in the purespace), because of the need to set/reset the `mark' bit.
I was thinking about this problem. What if we were to just treat all image-backed objects as already marked if they're in pages that are unmodified? (We can perform this test very cheaply, at least on */Linux and Windows.) Then we wouldn't mark them during GC, and we additionally don't demand-page objects just for GC.
The problem we create is that we might have modified image-backed objects reachable only from unmodified image-backed objects, and these modified objects might point to heap-allocated objects that we really should mark. So what if we walk the per-type allocation lists during the *mark* phase and treat all in-image objects on modified pages as individual roots? This way, we eventually mark all heap-allocated objects. (Let's assume that no image-backed unmodified object can directly point to a heap-allocated object.)
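A toy model of that marking scheme, to make the idea concrete. This is only a sketch under loud assumptions: objects are array slots holding at most one reference, the `page_dirty` array stands in for whatever cheap OS query tells us a page has been modified, and every name here (`heap`, `write_ref`, `mark_phase`, and so on) is hypothetical rather than taken from the Emacs GC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Objects 0..NIMAGE-1 are image-backed, PAGE_OBJS per page;
   the remaining objects are ordinary heap allocations. */
enum { NOBJ = 6, NIMAGE = 4, PAGE_OBJS = 2 };

struct obj {
    int ref;      /* index of the one object this points to, or -1 */
    bool marked;  /* mark bit; never written for clean image pages */
};

static struct obj heap[NOBJ];
/* Stand-in for the OS "has this page been modified?" query. */
static bool page_dirty[NIMAGE / PAGE_OBJS];

static void init_heap(void)
{
    for (int i = 0; i < NOBJ; i++) {
        heap[i].ref = -1;
        heap[i].marked = false;
    }
    for (int p = 0; p < NIMAGE / PAGE_OBJS; p++)
        page_dirty[p] = false;
}

static bool on_clean_image_page(int i)
{
    return i < NIMAGE && !page_dirty[i / PAGE_OBJS];
}

/* Objects on unmodified image pages count as marked without being touched. */
static bool is_marked(int i)
{
    return on_clean_image_page(i) || heap[i].marked;
}

/* A mutator write: in the real system the write itself would COW the page. */
static void write_ref(int i, int target)
{
    heap[i].ref = target;
    if (i < NIMAGE)
        page_dirty[i / PAGE_OBJS] = true;
}

static void mark(int i)
{
    while (i >= 0 && !is_marked(i)) {
        assert(i < NOBJ);
        heap[i].marked = true;  /* only ever writes heap or already-dirty pages */
        i = heap[i].ref;
    }
}

static void mark_phase(const int *roots, size_t nroots)
{
    for (size_t r = 0; r < nroots; r++)
        mark(roots[r]);
    /* Extra roots: every image object that sits on a modified page. */
    for (int i = 0; i < NIMAGE; i++)
        if (!on_clean_image_page(i))
            mark(i);
}
```

In this model, if clean image object 0 points to image object 2, and object 2 is later rewritten to point to heap object 4, a trace from root 0 stops immediately because object 0 already counts as marked. The walk over modified image pages is what picks up object 2 and therefore object 4, while the page holding object 0 is never written.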
This way, we can avoid touching most dumped data structures during GC. We might modify them for other reasons, though, like setting symbol value cells --- but if my quick and dirty GC test worked correctly, we should still save quite a bit on commit charge without worrying about these cases.