emacs-devel

From: Ihor Radchenko
Subject: Re: Exposing buffer text modifications to Lisp (was: Tree-sitter integration on feature/tree-sitter)
Date: Sat, 18 Jun 2022 13:52:59 +0800

Eli Zaretskii <eliz@gnu.org> writes:

> [I've changed the Subject, since this is no longer about tree-sitter.]

Well. I had some hope that we could generalize the tree-sitter interface
to allow Elisp-based parsers, but it is just a wish.

> OK, but that still doesn't tell what you need from the Emacs core.
> Can you describe those needs?  I presume that modification hooks (of
> any kind) are just the means; the real need is something else.  What
> is it?  If (as I presume) you need to know about changes to the
> buffer, then can you enumerate the changes that are of interest?  For
> example, are changes in text properties and overlays of interest, and
> if so, what kind of properties/overlays?  (But please don't limit your
> answers to just text properties and overlays, because I asked about
> them explicitly.)

Valid question. I am a bit too familiar with Org parser code and assume
that some things are "obvious" when they are not.

I will first answer about AST.

> Next, what kind of ASTs do you want to build, and how do you
> represent text as AST?  In particular, is the AST defined by regexps
> or some other Lisp data structures?

Org AST represents semantic objects using nested lists.
Similar to tree-sitter (AFAIU), each object in the tree is represented
by

(object-type (object-plist) object-children ...)

for example:

* test headline :tag:

is represented as

(headline
  (:raw-value "test headline" :begin 292 :end 314 ... :tags ("tag") ... :parent (...))
  ;; no children
  )
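
To make the layout concrete, here is a minimal sketch of reading such a
node. The my-* helper names are made up for illustration and are not
Org's actual accessors; the example node carries only the properties
spelled out above, with the elided ones left out.

```elisp
;; Hypothetical helpers for the (TYPE PLIST CHILDREN...) layout.
;; These are not the functions Org itself uses.
(defun my-node-type (node)
  "Return the type symbol of NODE."
  (car node))

(defun my-node-property (node prop)
  "Return the value of PROP in the property list of NODE."
  (plist-get (cadr node) prop))

(defun my-node-children (node)
  "Return the list of children of NODE, or nil."
  (cddr node))

;; Only the properties shown in the example; elided ones are omitted.
(defvar my-example-node
  '(headline (:raw-value "test headline" :begin 292 :end 314
              :tags ("tag"))))

(my-node-type my-example-node)            ; => headline
(my-node-property my-example-node :tags)  ; => ("tag")
(my-node-children my-example-node)        ; => nil
```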

Upon modifying text inside the headline, we need to update the
:begin/:end properties to reflect the new headline boundaries in the
buffer, and possibly update other headline properties (e.g. :tags).

The same should be done for all the elements containing the headline.
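
A rough sketch of that boundary update, walking a tree from the root
down. This only illustrates the idea and is not Org's actual code: the
function name, the example tree, and the assumption that every child is
itself a node (no plain strings) are all mine. Nodes that only partially
overlap the edit are left alone here; a real parser would have to
re-parse them.

```elisp
;; Hypothetical sketch, not Org's implementation.
(defun my-shift-node (node beg old-end delta)
  "Destructively adjust :begin/:end of NODE and its children.
The edit replaced the text between BEG and OLD-END, and DELTA is
the resulting change in length."
  (let* ((plist (cadr node))
         (nbeg (plist-get plist :begin))
         (nend (plist-get plist :end)))
    (cond
     ;; Node lies entirely after the edit: shift both boundaries.
     ((>= nbeg old-end)
      (setq plist (plist-put plist :begin (+ nbeg delta)))
      (setq plist (plist-put plist :end (+ nend delta))))
     ;; Node contains the edit: only its end moves.
     ((and (<= nbeg beg) (<= old-end nend))
      (setq plist (plist-put plist :end (+ nend delta)))))
    (setcar (cdr node) plist)
    ;; Recurse into child nodes (assumed to be lists, not strings).
    (dolist (child (cddr node))
      (when (consp child)
        (my-shift-node child beg old-end delta)))
    node))

;; A made-up tree: a container holding a headline and a paragraph.
(defvar my-tree
  (list 'section (list :begin 1 :end 400)
        (list 'headline (list :begin 292 :end 314))
        (list 'paragraph (list :begin 314 :end 400))))

;; Insert 5 characters at position 300, inside the headline.
(my-shift-node my-tree 300 300 5)
;; The headline now spans 292..319, the paragraph 319..405, and the
;; containing section 1..405.
```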

Updating the elements requires the following information:

1. Whether the modified text contained terminal symbols or text
   contributing to object-plist _before_ the modification.
2. The boundaries of the edited text in the buffer and the change in
   the text length.
3. Whether the modified text contains terminal symbols or text
   contributing to object-plist _after_ the modification.

Org does not care about text property changes or overlay changes.
We just perform a series of regexp searches over the changed parts of
the buffer (possibly with extended boundaries) before and after the
modification, and we need to know which region of text has been
modified (its beginning, end, and change in length).

Missing any significant change (one that involves terminal symbols or
changes the region length) will make the AST invalid.

I hope this clarifies the needs.

Best,
Ihor


