Re: pdumper's performance
From: Daniel Colascione
Subject: Re: pdumper's performance
Date: Wed, 29 Aug 2018 22:19:27 -0700
User-agent: SquirrelMail/1.4.23 [SVN]
> Thanks Daniel for your prompt response. I have some further questions,
> tho.
>
>> You can see for yourself whether there's an impact. Compile an Emacs with
>> support for both pdumper and unexec, dump it with unexec, and compare its
>> GC performance to Emacs built without support for pdumper and also dumped
>> with unexec.
>
> I was hoping to save myself the time ;-)
> [ BTW, part of the reason for those questions is that I'm writing an
> article about the history of Elisp, and I'd like to understand how
> your code works so I can say something intelligent about it.
> Oh and there's not much time left before the deadline.
Cool.
> Another part of course, is that I'd like to see this feature land
> on master. ]
Me too. ;-)
>
>> As I recall, the difference is minimal.
>
> Do you recall the tests you used and the ballpark of the difference?
Exactly the above. IIRC, the difference amounted to a millisecond or two
on an emacs -Q startup plus an immediate (garbage-collect) --- but that's
without the no-relocation optimization below.
>>> Also I don't quite understand why this is needed: IIUC the markbits of
>>> pdump'd objects are stored elsewhere, but I don't understand why that
>>> needs to be the case.
>> Because we don't store dumped objects in blocks and so the calculations of
>> the normal locations of their mark bits would be wrong.
>
> Hmm... OK that could explain it for conses and floats where we keep the
> markbits separately from the objects in bitmaps alongside those blocks,
> but you also have those <foo>_marked_p and set_<foo>_marked functions for
> all other types of objects where the markbit is normally stored within
> the object itself (i.e. it doesn't matter whether they're in blocks or
> not).
>
> Why did you choose to use a completely different layout for the objects
> loaded from the dump?
The objects themselves have the same layout that they do in the normal
heap. (The layout of a cons cell is unchanged, for example.) Dumping
objects individually instead of in blocks both simplifies the
implementation and allows for a more compact dump, as you point out below.
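To make the block-based calculation concrete, here is a minimal sketch of how a mark bit can be located when objects live in aligned blocks. It is an illustration only, not the code in alloc.c: the names (struct obj, struct block, block_of, set_marked, marked_p, new_block) and the sizes are hypothetical, and it assumes a C11 compiler.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sizes, chosen for illustration only. */
#define BLOCK_ALIGN 1024u
#define OBJS_PER_BLOCK 60u

struct obj { void *car, *cdr; };          /* stand-in for a cons cell */

/* Objects and their mark bitmap share one aligned block. */
struct block {
  struct obj objs[OBJS_PER_BLOCK];
  unsigned char markbits[(OBJS_PER_BLOCK + 7) / 8];
};

_Static_assert(sizeof(struct block) <= BLOCK_ALIGN, "block must fit");

/* Recover the block by masking off the low address bits.  This is
   only valid if the object really lives in an aligned block.  */
static struct block *block_of(const struct obj *o)
{
  return (struct block *) ((uintptr_t) o & ~(uintptr_t) (BLOCK_ALIGN - 1));
}

/* Index of the object within its block's objs array.  */
static size_t index_of(const struct obj *o)
{
  return (size_t) (o - block_of(o)->objs);
}

static void set_marked(struct obj *o)
{
  size_t i = index_of(o);
  block_of(o)->markbits[i / 8] |= (unsigned char) (1u << (i % 8));
}

static bool marked_p(const struct obj *o)
{
  size_t i = index_of(o);
  return (block_of(o)->markbits[i / 8] >> (i % 8)) & 1u;
}

/* Allocate a zeroed block aligned to BLOCK_ALIGN.  */
static struct block *new_block(void)
{
  struct block *b = aligned_alloc(BLOCK_ALIGN, BLOCK_ALIGN);
  assert(b != NULL);
  memset(b, 0, BLOCK_ALIGN);
  return b;
}
```

For an object written individually into a dump, masking its address would land on an arbitrary spot in the dump image rather than on a block header, so block_of and index_of would compute a meaningless bitmap location.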
> I naively thought your code would take
> cons_blocks, symbol_blocks, ... and write those blocks as-is so objects
> keep the same layout, and things like mark_maybe_object don't need to be
> changed at all. I understand this would end up writing larger dumps
> (since they would include some free objects), but I'd have expected it
> would lead to simpler code and a smaller patch.
If we keep the mark bits out of the objects, we can avoid modifying the
object pages just for GC. In the non-PIC case, in which in principle we
don't have to relocate the dump, that means that the pages in the dump
stay clean and file-backed, not dirty, COWed, and pagefile-backed as they
would if we banged on them just for the GC. That's an efficiency win.
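As a rough sketch of the idea, assuming one mark bit per aligned slot in the dump: the bitmap lives in separately allocated memory, and marking only ever writes to that bitmap, never to the object pages. All names here (struct dump_region, dump_init, in_dump, dump_set_marked, dump_marked_p) and the 8-byte alignment are hypothetical and not taken from the pdumper patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define DUMP_ALIGN 8u   /* hypothetical object alignment inside the dump */

struct dump_region {
  const unsigned char *start;   /* read-only view of the dumped objects */
  size_t size;                  /* size of the region in bytes */
  unsigned char *markbits;      /* separately allocated, one bit per slot */
};

/* Set up a zeroed bitmap covering SIZE bytes starting at START.  */
static bool dump_init(struct dump_region *d, const void *start, size_t size)
{
  assert(size > 0);
  size_t nbits = (size + DUMP_ALIGN - 1) / DUMP_ALIGN;
  d->start = start;
  d->size = size;
  d->markbits = calloc((nbits + 7) / 8, 1);
  return d->markbits != NULL;
}

/* True if P points into the dump region.  */
static bool in_dump(const struct dump_region *d, const void *p)
{
  uintptr_t a = (uintptr_t) p, s = (uintptr_t) d->start;
  return a >= s && a - s < d->size;
}

/* Bit index derived from the object's offset within the dump.  */
static size_t bit_of(const struct dump_region *d, const void *p)
{
  assert(in_dump(d, p));
  return ((uintptr_t) p - (uintptr_t) d->start) / DUMP_ALIGN;
}

/* Marking writes only to the external bitmap; the object is untouched.  */
static void dump_set_marked(struct dump_region *d, const void *p)
{
  size_t i = bit_of(d, p);
  d->markbits[i / 8] |= (unsigned char) (1u << (i % 8));
}

static bool dump_marked_p(const struct dump_region *d, const void *p)
{
  size_t i = bit_of(d, p);
  return (d->markbits[i / 8] >> (i % 8)) & 1u;
}
```

Because the mark and test functions take const pointers to the objects, the region could in principle be a read-only, file-backed mapping, and a GC pass would dirty only the bitmap.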
For a future more-efficient GC, contiguous object storage with external
mark bits is probably the way to go for the entire heap.