Re: Opportunistic GC

From: Pip Cet
Subject: Re: Opportunistic GC
Date: Mon, 8 Mar 2021 10:44:06 +0000

On Mon, Mar 8, 2021 at 9:51 AM martin rudalics <rudalics@gmx.at> wrote:
>  > We fork(), we don't clone(). If A exists at the time of fork() but
>  > isn't reachable through the thread's stack, that's a bug anyway; if it
>  > doesn't exist yet at the time of fork(), it's not collected in this
>  > cycle.
> I have no idea what the difference between fork and clone implies here.

fork() is a (relatively) cheap copy-on-write "copy" of the address space.

clone() with a shared address space (CLONE_VM) would cause severe
problems, of the type you describe above. fork() is the poor man's
way (very poor: computers are expensive) to avoid those problems.
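
To make the snapshot behaviour concrete, here is a minimal sketch in C.
It is not Emacs code: heap_word and snapshot_value_seen_by_child() are
hypothetical names invented for this illustration. The parent changes a
variable after fork(), and the child still reports the value it had at
the time of the fork.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for one word of the Lisp heap. */
static volatile int heap_word;

/* Fork a child, mutate heap_word in the parent, and return the value
   the child observed afterwards.  Returns -1 on error.  */
int snapshot_value_seen_by_child(void)
{
    int go_pipe[2];
    heap_word = 1;
    if (pipe(go_pipe) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(go_pipe[0]);
        close(go_pipe[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: block until the parent has changed its own copy. */
        char go;
        close(go_pipe[1]);
        if (read(go_pipe[0], &go, 1) != 1)
            _exit(255);
        _exit(heap_word);       /* report the snapshot via exit status */
    }

    /* Parent: this write unshares the page holding heap_word in the
       parent only; the child's view is unaffected.  */
    close(go_pipe[0]);
    heap_word = 2;
    char go = 'g';
    int sent = (write(go_pipe[1], &go, 1) == 1);
    close(go_pipe[1]);

    int status;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status) || !sent)
        return -1;
    assert(heap_word == 2);     /* the parent keeps its own update */
    return WEXITSTATUS(status);
}
```

Calling snapshot_value_seen_by_child() should return 1, the value at
the time of fork(), even though the parent has since stored 2.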

> IIUC you have to make a copy of Lisp thread and heap, have the collector

A "copy", as above.

> operate on the copies and, when done, pass the now unmarked objects to
> the Lisp thread for recycling in the original heap.  IIUC that doubles

Yes, that part is correct.

> the size needed for heap and stack.

Not quite. It's a copy-on-write copy, and we would avoid unsharing the
pages we don't set mark bits in. Pages that are collected in their
entirety have no mark bits set, so they wouldn't get copied.

And, yes, we should keep the mark bits separate from the data so we
could avoid unsharing an entire page because a single object in it
survives GC.
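
A rough sketch of what "mark bits separate from the data" could look
like, in C. The struct and function names are made up for this
illustration and do not correspond to Emacs's allocator. There is one
mark bit per heap word, stored in its own array, so marking never
writes to the pages that hold the objects.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct heap {
    uintptr_t *words;       /* object storage; marking never writes here */
    unsigned char *marks;   /* one bit per word, in separate memory */
    size_t nwords;
};

/* Size of the side bitmap: one bit per heap word, rounded up. */
size_t mark_bitmap_bytes(size_t nwords)
{
    return (nwords + CHAR_BIT - 1) / CHAR_BIT;
}

/* Returns 0 on success, -1 if allocation fails. */
int heap_init(struct heap *h, size_t nwords)
{
    assert(nwords > 0);
    h->words = calloc(nwords, sizeof *h->words);
    h->marks = calloc(mark_bitmap_bytes(nwords), 1);
    h->nwords = nwords;
    if (h->words == NULL || h->marks == NULL) {
        free(h->words);
        free(h->marks);
        return -1;
    }
    return 0;
}

/* Set the mark bit for word I; only the bitmap page is dirtied. */
void heap_mark(struct heap *h, size_t i)
{
    assert(i < h->nwords);
    h->marks[i / CHAR_BIT] |= (unsigned char) (1u << (i % CHAR_BIT));
}

int heap_is_marked(const struct heap *h, size_t i)
{
    assert(i < h->nwords);
    return (h->marks[i / CHAR_BIT] >> (i % CHAR_BIT)) & 1;
}

void heap_free(struct heap *h)
{
    free(h->words);
    free(h->marks);
}
```

With 8-byte words, 1024 words occupy 8192 bytes while the bitmap needs
128 bytes, which is the 1/64 ratio. In a forked collector only the
bitmap pages would be written to, so only they would be unshared.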

If the stack is of significant size (it usually isn't), it won't get
copied, either.

> And short-lived objects have to
> wait for the next cycle to get recovered.

I'm not aware of any GC algorithm that recovers objects allocated
after the GC cycle started :-)

> Or what am I missing?

Mostly that there should not be a doubling of memory usage. With the
mark bits fixed (that is, kept separate from the objects), memory
usage would grow by, at most, 1/64th the heap size on 64-bit systems
(one mark bit per 64-bit word), plus a constant. I think we can live
with a roughly 1.6% increase in memory usage if we get (effectively)
zero-cost (or zero-latency) GC in exchange. (I'm assuming there'll be
an unused CPU core, which is usually true for me unless I'm compiling
something.) With the mark bits not fixed, memory usage would increase
by approximately the size of the surviving heap, but not the size of
the discarded heap pages.

In addition to keeping the mark bits discontiguous from the marked
object, we should free entire pages without, as we do at present,
first faulting them in. That's part of the reason Emacs does so badly
when the system starts swapping.
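
One possible way to release whole pages without touching them, sketched
in C under the assumption of a Linux system and a heap region obtained
from an anonymous private mmap(). The helper names are hypothetical and
this is not how Emacs currently allocates. madvise() with MADV_DONTNEED
tells the kernel to drop the pages; it does not need to read them back
from swap first.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

static size_t page_size(void)
{
    return (size_t) sysconf(_SC_PAGESIZE);
}

/* Map NPAGES of anonymous private memory; returns NULL on failure. */
void *region_alloc(size_t npages)
{
    void *p = mmap(NULL, npages * page_size(), PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Hand NPAGES starting at BASE back to the kernel without reading
   them.  BASE must be page-aligned.  Returns 0 on success.  On Linux
   the range stays mapped and reads as zero-filled afterwards.  */
int region_discard(void *base, size_t npages)
{
    assert(base != NULL);
    return madvise(base, npages * page_size(), MADV_DONTNEED);
}

/* Unmap the region entirely.  Returns 0 on success. */
int region_release(void *base, size_t npages)
{
    return munmap(base, npages * page_size());
}
```

A sweep phase that knows from the side bitmap that a page holds no
surviving objects could call region_discard() on it directly, instead
of walking the dead objects and faulting the page in.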

Note that none of this is "real" GC: we still mark and sweep, just in
a slightly smarter way.

