Re: GC mark stack

emacs-devel

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: GC mark stack

From:	Lynn Winebarger
Subject:	Re: GC mark stack
Date:	Tue, 16 Aug 2022 05:50:34 -0400

On Mon, Aug 15, 2022 at 9:53 PM Lynn Winebarger <owinebar@gmail.com> wrote:
>
> On Mon, Aug 15, 2022 at 2:17 PM Stefan Monnier <monnier@iro.umontreal.ca> 
> wrote:
> >
> > > If the loop can be expressed by a relational expression consisting of
> > > a concrete relation (like "references"), composition, union, and
> > > transitive closure, then there is a systematic way to construct an
> > > algorithm of the type I described.
> >
> > I think the core of the problem is the following:
> >
> > In a "key weak" hash table, if a reachable object points (transitively) to
> > an object that is a key in that table, then the value associated with
> > that key becomes reachable.  So conceptually we have a reference (an
> > edge) from the key *object* to the value, but this reference is not
> > actually represented in the heap.
>
> Right, so the weak part(s) of the hash table entry essentially invert
> the strong reference relation graph - I wasn't clear on this part.  As
> you surmised, explicit construction of the inverted relation (i.e.
> reifying the back-reference) is the most straightforward method for
> creating the graph to traverse with Tarjan's algorithm when inversion
> is added to the set of operators allowed in the relational
> expressions.  I'm not 100% on the semantics of weak hash table
> reference types, so can you verify:
> *  Key -  key strongly-references entry, entry strongly-references
> value   [ asserting the existence of 2 edges in the graph of the
> strong-references relation ]
> *  Value - value strongly-references entry, entry strongly-references key
> *  KeyOrValue - ( key strongly-references entry, entry
> strongly-references value ) OR ( value strongly-references entry,
> entry strongly-references key ) - maybe equivalent to introducing
> intermediate entries for each case:
>     *  (key strongly-references key-entry, key-entry
> strongly-references entry, key-entry strongly-references value),
> (value strongly-references value-entry, value-entry
> strongly-references entry, value-entry strongly-reference key)
> *  KeyAndValue - (key, value) strongly-references entry, where (key,
> value) is a special node that is only live if both key and value are
> (or entry itself is such node)
>
   Also every entry strongly references the hash table containing it,
if that's not obvious.

> I'm really unsure about those last two.  I'm also not sure what the
> right logic is if the hash function is for equal or eqv and not eq.
> Or, that matter, whether a fixnum or float as an eq key should always
> or never keep an entry live.
> I would also be concerned about adding costs to code that does not use
> weak references at all.  The best I think you could do is add a simple
> bit flag to indicate whether an object has been used as the (or a)
> target of the weak part of a weak hash table entry, or the target of a
> weak reference in general.  Then at least it's only a quick check in
> mark_object in general, with more involved lookup into the reified
> references if it has been set.
> I think there's probably room for such a bit in all types that would
> make sense as weak keys.  The complexity is probably prohibitive for
> any savings, I don't know.
>
> I did a brief survey of the tree in correspondence with Mattias
> Engdegård for appearances of weak hash tables in the "core":
>
>     * all_loaded_comp_units_h in src/comp.c
>     * composition_hash_table in src/composite.c - comment is out of
> date, says it's weak, but then when actually created is not
>     * advertised-signature-table in lisp/emacs-lisp/byte-run.el - used
> in byte compiler to detect obsolete calling convention
>     * cl--generic-combined-method-memoization in lisp/emacs-lisp/cl-generic.el
>     * eieio--object-names in lisp/emacs-lisp/eieio.el
>     * ert--test-buffers in lisp/emacs-lisp/ert-x.el
>     * macro-exp--warned in lisp/emacs-lisp/macroexp.el
>     * read-answer-map--memoize in lisp/emacs-lisp/map-ynp.el
>     * pcase--memoize in lisp/emacs-lisp/pcase.el
>     * undo-equiv-table in lisp/simple.el
>     * add-to-ordered-list in lisp/subr.el uses a local variable
> ordering bound to a weak hash table
>
> What would be interesting on the first one is if, after purifying
> strings before dumping, the weak references from NCU values kept the
> filename strings used as keys alive, and those were also (possibly
> weak) keys in some other hash table.  I have no idea if that's the
> case, it would just be a weird, probably unintended side effect of
> pure space.  I'm not sure how many of the others are long-lived or
> potentially large enough to be a concern (with 2100+ loaded NCUs, the
> first one is neither short-lived nor particularly negligible in size)
> I'm not sure how you can determine whether it's an issue in practice.

An alternative might be to just have a bit of bookkeeping for the
reified references when creating or modifying weak entries that could
create cycles to add them to an explicit set (rather than a list of
weak hash tables).  Then if there are no such entries, you could avoid
the loop entirely.  Or at least the loop would be explicitly
proportional in size to the number of entries that might require
traversal, rather than  looking at every entry of every reachable weak
hash table.

Lynn

[Prev in Thread]

Current Thread

[Next in Thread]

Re: GC mark stack, Stefan Monnier, 2022/08/15
- Re: GC mark stack, Lynn Winebarger, 2022/08/15
  - Re: GC mark stack, Stefan Monnier, 2022/08/15
    - Re: GC mark stack, Lynn Winebarger, 2022/08/15
    - Re: GC mark stack, Stefan Monnier, 2022/08/15
    - Re: GC mark stack, Lynn Winebarger, 2022/08/15
    - Re: GC mark stack, Lynn Winebarger <=

Prev by Date: Re: Ideal performance of ELisp
Next by Date: Re: Why tree-sitter instead of Semantic? (was Re: CC Mode with font-lock-maximum-decoration 2)
Previous by thread: Re: GC mark stack
Next by thread: Re: master 629f980fad: Enhance safe_run_hooks_1 and safe_run_hook_funcall to handle more arguments
Index(es):
- Date
- Thread