Re: GC mark stack

emacs-devel

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: GC mark stack

From:	Lynn Winebarger
Subject:	Re: GC mark stack
Date:	Mon, 15 Aug 2022 21:53:27 -0400

On Mon, Aug 15, 2022 at 2:17 PM Stefan Monnier <monnier@iro.umontreal.ca> wrote:
>
> > If the loop can be expressed by a relational expression consisting of
> > a concrete relation (like "references"), composition, union, and
> > transitive closure, then there is a systematic way to construct an
> > algorithm of the type I described.
>
> I think the core of the problem is the following:
>
> In a "key weak" hash table, if a reachable object points (transitively) to
> an object that is a key in that table, then the value associated with
> that key becomes reachable.  So conceptually we have a reference (an
> edge) from the key *object* to the value, but this reference is not
> actually represented in the heap.

Right, so the weak part(s) of the hash table entry essentially invert
the strong reference relation graph - I wasn't clear on this part.  As
you surmised, explicit construction of the inverted relation (i.e.
reifying the back-reference) is the most straightforward method for
creating the graph to traverse with Tarjan's algorithm when inversion
is added to the set of operators allowed in the relational
expressions.  I'm not 100% on the semantics of weak hash table
reference types, so can you verify:
*  Key -  key strongly-references entry, entry strongly-references
value   [ asserting the existence of 2 edges in the graph of the
strong-references relation ]
*  Value - value strongly-references entry, entry strongly-references key
*  KeyOrValue - ( key strongly-references entry, entry
strongly-references value ) OR ( value strongly-references entry,
entry strongly-references key ) - maybe equivalent to introducing
intermediate entries for each case:
    *  (key strongly-references key-entry, key-entry
strongly-references entry, key-entry strongly-references value),
(value strongly-references value-entry, value-entry
strongly-references entry, value-entry strongly-reference key)
*  KeyAndValue - (key, value) strongly-references entry, where (key,
value) is a special node that is only live if both key and value are
(or entry itself is such node)

I'm really unsure about those last two.  I'm also not sure what the
right logic is if the hash function is for equal or eqv and not eq.
Or, that matter, whether a fixnum or float as an eq key should always
or never keep an entry live.
I would also be concerned about adding costs to code that does not use
weak references at all.  The best I think you could do is add a simple
bit flag to indicate whether an object has been used as the (or a)
target of the weak part of a weak hash table entry, or the target of a
weak reference in general.  Then at least it's only a quick check in
mark_object in general, with more involved lookup into the reified
references if it has been set.
I think there's probably room for such a bit in all types that would
make sense as weak keys.  The complexity is probably prohibitive for
any savings, I don't know.

I did a brief survey of the tree in correspondence with Mattias
Engdegård for appearances of weak hash tables in the "core":

    * all_loaded_comp_units_h in src/comp.c
    * composition_hash_table in src/composite.c - comment is out of
date, says it's weak, but then when actually created is not
    * advertised-signature-table in lisp/emacs-lisp/byte-run.el - used
in byte compiler to detect obsolete calling convention
    * cl--generic-combined-method-memoization in lisp/emacs-lisp/cl-generic.el
    * eieio--object-names in lisp/emacs-lisp/eieio.el
    * ert--test-buffers in lisp/emacs-lisp/ert-x.el
    * macro-exp--warned in lisp/emacs-lisp/macroexp.el
    * read-answer-map--memoize in lisp/emacs-lisp/map-ynp.el
    * pcase--memoize in lisp/emacs-lisp/pcase.el
    * undo-equiv-table in lisp/simple.el
    * add-to-ordered-list in lisp/subr.el uses a local variable
ordering bound to a weak hash table

What would be interesting on the first one is if, after purifying
strings before dumping, the weak references from NCU values kept the
filename strings used as keys alive, and those were also (possibly
weak) keys in some other hash table.  I have no idea if that's the
case, it would just be a weird, probably unintended side effect of
pure space.  I'm not sure how many of the others are long-lived or
potentially large enough to be a concern (with 2100+ loaded NCUs, the
first one is neither short-lived nor particularly negligible in size)
I'm not sure how you can determine whether it's an issue in practice.

Lynn

[Prev in Thread]

Current Thread

[Next in Thread]

Re: GC mark stack, Stefan Monnier, 2022/08/15
- Re: GC mark stack, Lynn Winebarger, 2022/08/15
  - Re: GC mark stack, Stefan Monnier, 2022/08/15
    - Re: GC mark stack, Lynn Winebarger, 2022/08/15
    - Re: GC mark stack, Stefan Monnier, 2022/08/15
    - Re: GC mark stack, Lynn Winebarger <=
    - Re: GC mark stack, Lynn Winebarger, 2022/08/16

Prev by Date: Re: New ELPA package ada-lite
Next by Date: Making elp.el behave like a real profiler
Previous by thread: Re: GC mark stack
Next by thread: Re: GC mark stack
Index(es):
- Date
- Thread