emacs-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Overlays as an AA-tree


From: Stefan Monnier
Subject: Re: Overlays as an AA-tree
Date: Tue, 21 Feb 2017 14:18:44 -0500
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/26.0.50 (gnu/linux)

>   /* master */
>   struct Lisp_Overlay
>   {
>       ENUM_BF (Lisp_Misc_Type) type : 16;
>       bool_bf gcmarkbit : 1;
>       unsigned spacer : 15;
>       struct Lisp_Overlay *next;
>       Lisp_Object start;          /* -> 40 bytes */
>       Lisp_Object end;            /* -> 40 bytes */
>       Lisp_Object plist;          /* -> 40 bytes */
>   };  /* 40 bytes */
>   /* Total: 40 + 3*40 = 160 */

Hmm... I don't know how you're counting either ;-)

the type+gcmarkbit+spacer bit fields sum up to 32bits (let's round that
to a "word").
Then we have 4 fields that hold a "word" (Lisp_Object is basically an
integer), for a total of 5 words.  On 32bit systems, alloc.c rounds it
up further to 6 words or 24B (because Lisp_Objects have to be aligned on
a multiple of 8B).

>   /* noverlay */
>   struct Lisp_Overlay
>   {
>       ENUM_BF (Lisp_Misc_Type) type : 16;
>       bool_bf gcmarkbit : 1;
>       unsigned spacer : 15;
>       Lisp_Object plist;          /* -> 40 bytes */
>       struct buffer *buffer;
>       struct interval_node *interval;
>   };  /* 32 bytes */

So that's 4 words (one less than before).

>   struct interval_node            /* layout tweaked */
>   {
>       struct interval_node *parent;
>       struct interval_node *left;
>       struct interval_node *right;
>       ptrdiff_t begin;         /* The beginning of this interval. */
>       ptrdiff_t end;           /* The end of the interval. */
>       ptrdiff_t limit;         /* The maximum end in this subtree. */
>       ptrdiff_t offset;        /* The amount of shift to apply to this 
> subtree. */
>       uintmax_t otick;         /* offset modified tick */
>       Lisp_Object data;        /* Exclusively used by the client. */
>       bool_bf visited : 1;     /* For traversal via generator. */
>       bool_bf rear_advance : 1;   /* Same as for marker and overlays.  */
>       bool_bf front_advance : 1;  /* Same as for marker and overlays.  */
>       enum { ITREE_RED, ITREE_BLACK } color : 1;
>   };  /* 80 bytes */
>   /* Total: 32 + 40 + 80 = 152 */

Here I count 9 words plus some bit fields (i.e. 10 words).  I assume
there's one "struct interval_node" per overlay object, so every overlay
object uses 14 words (plus 2 because all Lisp_Misc end up using 6 words
anyway) which is indeed less than the current setup.

Is there a reason why "struct Lisp_Overlay" and "struct interval_node"
are separate?  IOW could we have a single struct with all the
fields together (as a Lisp_Vectorlike rather than a Lisp_Misc)?
If so, we could bring this down to 12 words plus the Lisp_Vectorlike
header, i.e. 14 words.
[ Of course, we could probably squeeze it a bit further if we really
  care, e.g. merge the `parent' and `buffer' fields, at the cost of
  walking up the tree when we need to find the overlay's buffer.
  But I doubt it's worthwhile.  ]

That means an overlay takes up about twice the size of a marker.

Should we then (re)implement markers as degenerate overlays?  [it's not
as simple as it sounds: contrary to overlays markers can be GC'd if not
referenced, so we'd need to add support for "weakly referenced
overlays".  Furthermore markers are used to speed up byte<->char
conversions, so we'd need to replace that code accordingly.  ]


        Stefan




reply via email to

[Prev in Thread] Current Thread [Next in Thread]