emacs-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: noverlay branch


From: Matt Armstrong
Subject: Re: noverlay branch
Date: Thu, 06 Oct 2022 13:41:08 -0700

Stefan Monnier <monnier@iro.umontreal.ca> writes:

> I just updated the noverlay branch to the code in `master` (most of it
> was mechanical except for the fact that since the last commit on that
> branch we have gotten rid of Lisp_Misc and we moved to pdumper).
>
> I'm getting ready to install this in `master`, so I'd encourage people
> to try this code as much as possible to try and weed out the most
> glaring problems before it hits master.  The code generally looks good,
> but it touches some quite "core" code in the redisplay (with lots of
> off-by-one opportunities) and in the memory management (with
> opportunities for crashes and memory leaks).
>
> For those who're not familiar with this branch, it changes the way
> overlays are stored in a buffer: instead of keeping them in naïve
> singly-linked lists, it keeps them in balanced binary trees, so as to
> replace an O(N) complexity with O(log N). This way Emacs should not get
> sluggish even with millions of overlays (of course, if a given buffer
> position is covered by a million overlays, it'll still be sluggish when
> operating around that position).
>
>
>         Stefan

Stefan, I don't have Emacs commit so I've attached comment improvement
patches here.  Lemme know if you'd prefer this occur on a bug.  These
commits are also on https://git.sr.ht/~matta/emacs/log/feature/noverlay

>From 6dff825a9943434cfccd64916c506ab10977acf8 Mon Sep 17 00:00:00 2001
From: Matt Armstrong <matt@rfc20.org>
Date: Thu, 6 Oct 2022 09:36:24 -0700
Subject: [PATCH 1/4] ; * src/itree.h: include "lisp.h" for Lisp_Object

---
 src/itree.c | 2 +-
 src/itree.h | 2 ++
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/itree.c b/src/itree.c
index ed31ef1156..a782410860 100644
--- a/src/itree.c
+++ b/src/itree.c
@@ -19,7 +19,7 @@ Copyright (C) 2017-2022  Free Software Foundation, Inc.
 
 #include <config.h>
 #include <math.h>
-#include "lisp.h"
+
 #include "itree.h"
 
 /*
diff --git a/src/itree.h b/src/itree.h
index 8f6bb667d6..9b79551f77 100644
--- a/src/itree.h
+++ b/src/itree.h
@@ -23,6 +23,8 @@ #define ITREE_H
 #include <stddef.h>
 #include <inttypes.h>
 
+#include "lisp.h"
+
 /* The tree and node structs are mainly here, so they can be allocated.
 
    NOTE: The only time where it is safe to modify node.begin and
-- 
2.35.1

>From 4048988f24c60104e6658b164a34df752f7b6167 Mon Sep 17 00:00:00 2001
From: Matt Armstrong <matt@rfc20.org>
Date: Thu, 6 Oct 2022 13:05:19 -0700
Subject: [PATCH 2/4] ; * src/itree.h (struct interval_node): document field
 invariants.

---
 src/itree.h | 58 ++++++++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 49 insertions(+), 9 deletions(-)

diff --git a/src/itree.h b/src/itree.h
index 9b79551f77..52219a8eef 100644
--- a/src/itree.h
+++ b/src/itree.h
@@ -25,22 +25,62 @@ #define ITREE_H
 
 #include "lisp.h"
 
-/* The tree and node structs are mainly here, so they can be allocated.
-
-   NOTE: The only time where it is safe to modify node.begin and
-   node.end directly, is while the node is not part of any tree.
-
-   NOTE: It is safe to read node.begin and node.end directly, if the
-   node came from a generator, because it validates the nodes it
-   returns as a side-effect.
-*/
+/* NOTE: the fields of the node and tree structs should for the most
+ * part be treated as opaque to the rest of Emacs.  Exceptions are
+ * noted in comments.  They are in the header file so they can be
+ * allocated.
+ */
 
 struct interval_node;
 struct interval_node
 {
+  /* The normal parent, left and right links found in binary trees.
+     See also `red`, below, which completes the Red-Black tree
+     representation.  */
   struct interval_node *parent;
   struct interval_node *left;
   struct interval_node *right;
+
+  /* The following five fields comprise the interval abstraction.
+
+     BEGIN, END are buffer positions.  BEGIN and END are the beginning
+     and end of this interval.  These form an inclusive, exlusive (or
+     closed, open) range of buffer positions [BEGIN..END).  When a
+     node is in a tree these fields are read only, written only by
+     itree functions.
+
+     The LIMIT, OFFSET and OTICK fields should be considered internal
+     to itree.c and used only by itree functions.
+
+     LIMIT is a buffer position, the maximum of END of this node and
+     its children.  See itree.c for its use.
+
+     OFFSET is in buffer position units, and will be non-zero only
+     when the node is dirty.
+
+     OTICK determines whether BEGIN, END, LIMIT and OFFSET are
+     considered dirty.  A node is clean when its OTICK is equal to the
+     OTICK of its tree (see struct interval_tree).  Otherwise, it is
+     dirty.
+
+     In a clean node, BEGIN, END and LIMIT are correct buffer
+     positions, and OFFSET is zero.  The parent of a clean node is
+     clean also clean, recursively.
+
+     In a dirty node, the node's OTICK won't equal its tree's OTICK,
+     and its OFFSET may be non-zero.  At all times the descendents of
+     a dirty node are also dirty.  BEGIN, END and LIMIT require
+     adjustment before use as buffer positions.
+
+     NOTE: BEGIN and END must not be modified while the node is part
+     of a tree.  Use interval_tree_insert_gap and
+     interval_tree_delete_gap instead.
+
+     NOTE: The interval generators ensure nodes are clean before
+     yielding them, so BEGIN and END may be safely used as buffer
+     positions then.
+  */
+
   ptrdiff_t begin;             /* The beginning of this interval. */
   ptrdiff_t end;               /* The end of the interval. */
   ptrdiff_t limit;             /* The maximum end in this subtree. */
-- 
2.35.1

>From 52470aa06f5f037358d957271101b8dbaa3f44e7 Mon Sep 17 00:00:00 2001
From: Matt Armstrong <matt@rfc20.org>
Date: Thu, 6 Oct 2022 13:12:54 -0700
Subject: [PATCH 3/4] ; * src/itree.c: change comments for clarity.

---
 src/itree.c | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/src/itree.c b/src/itree.c
index a782410860..3098fe1cf4 100644
--- a/src/itree.c
+++ b/src/itree.c
@@ -24,7 +24,7 @@ Copyright (C) 2017-2022  Free Software Foundation, Inc.
 
 /*
    Intervals of the form [BEGIN, END), are stored as nodes inside a RB
-   tree, sorted by BEGIN .  The core operation of this tree (besides
+   tree, ordered by BEGIN.  The core operation of this tree (besides
    insert, remove, etc.) is finding all intervals intersecting with
    some given interval.  In order to perform this operation
    efficiently, every node stores a third value called LIMIT. (See
@@ -65,10 +65,10 @@ Copyright (C) 2017-2022  Free Software Foundation, Inc.
    ==== Adjusting intervals ====
 
    Since this data-structure will be used for overlays in an Emacs
-   buffer, a second core operation implements the ability to insert or
-   delete gaps in resp. from the tree.  This models the insertion
-   resp. deletion of text in a buffer and the effects it may have on
-   the positions of overlays.
+   buffer, a second core operation is the ability to insert and delete
+   gaps in the tree.  This models the insertion and deletion of text
+   in a buffer and the effects it may have on the positions of
+   overlays.
 
    Consider this: Something gets inserted at position P into a buffer
    and assume that all overlays occur strictly after P.  Ordinarily,
@@ -79,10 +79,10 @@ Copyright (C) 2017-2022  Free Software Foundation, Inc.
 
    The OFFSET of some some subtree, represented by its root, is the
    amount of shift that needs to be applied to its BEGIN, END and
-   LIMIT values, in order to get to the real values.  Coming back to
-   the example, all we would need to do in this case, is to increment
-   the OFFSET of the tree's root, without any traversal of the tree
-   itself.
+   LIMIT values, in order to get to the actual buffer positions.
+   Coming back to the example, all we would need to do in this case,
+   is to increment the OFFSET of the tree's root, without any
+   traversal of the tree itself.
 
    As a consequence, the real values of BEGIN, END and LIMIT of some
    NODE need to be computed by incrementing them by the sum of NODE's
-- 
2.35.1

>From 53238b6c16f4dc3dec8d60205e88f5f79200d187 Mon Sep 17 00:00:00 2001
From: Matt Armstrong <matt@rfc20.org>
Date: Thu, 6 Oct 2022 13:18:46 -0700
Subject: [PATCH 4/4] Use a bool instead of a bitfield

* src/itree.c (struct interval_generator): use a bool instead of a
bitfield, since space is not an issue.
---
 src/itree.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/itree.c b/src/itree.c
index 3098fe1cf4..79e39d6e2a 100644
--- a/src/itree.c
+++ b/src/itree.c
@@ -145,7 +145,7 @@ null_is_sane (void)
   ptrdiff_t end;
   uintmax_t otick;              /* A copy of the tree's `otick`.  */
   enum interval_tree_order order;
-  bool_bf running : 1;
+  bool running;
   const char* file;
   int line;
 };
-- 
2.35.1


reply via email to

[Prev in Thread] Current Thread [Next in Thread]