bug-gnu-emacs

bug#60656: 30.0.50; tree-sitter: editing a buffer invalidates visited n


From: Yuan Fu
Subject: bug#60656: 30.0.50; tree-sitter: editing a buffer invalidates visited node instances
Date: Mon, 9 Jan 2023 12:30:20 -0800

Mickey Petersen <mickey@masteringemacs.org> writes:

> Yuan Fu <casouri@gmail.com> writes:
>
>> Mickey Petersen <mickey@masteringemacs.org> writes:
>>
>>> If you parse some text, retrieve a node -- using `treesit-node-at',
>>> for example -- and then edit the buffer, then the node you retrieved
>>> is marked outdated.
>>>
>>> However, tree-sitter is capable of handling that, to a greater or lesser
>>> extent:
>>>
>>> https://tree-sitter.github.io/tree-sitter/using-parsers#editing
>>>
>>> It is therefore possible to refresh node instances that were created
>>> _before_ the edit. I suppose it could remain an explicit step that you
>>> must enter a special form and then Emacs will track node instances
>>> issued inside that form and refresh them when edits take place inside
>>> of it.
>>>
>>> As it stands, it is very hard to edit and maintain a node registry at
>>> the same time. (I'm using markers and overlays as a crude hack to work
>>> around it.)
>>
>> This is kind of a limitation of tree-sitter. The "node editing" isn’t
>> like what you thought (it fooled me too when I first read it).
>> Tree-sitter’s incremental parsing works roughly like this:
>>
>> 1. You have a parsed tree, TREE, corresponding to some TEXT.
>> 2. You make some edit to the TEXT, e.g., TEXT’ = insert(TEXT, 1, "abc").
>> 3. Now you need to "edit" the old tree with the _positions_ of your edit:
>>    edit(TREE, Insert(pos=1, len=3)) (Notice that this modifies the tree
>>    in place.)
>> 4. You reparse the edited tree and get a new tree:
>>    TREE’ = parse(TREE, TEXT’) (Notice that this returns a new tree.)
>>
>> If you have a NODE from TREE, editing that node only updates its position
>> information. That corresponds to the edit(TREE, ...) step. There is no
>> equivalent of the parse(TREE, TEXT’) step for nodes: once the tree is
>> reparsed and a new tree is returned, none of the nodes in the old tree
>> is carried over to the new tree. In practice, tree-sitter reuses the old
>> tree’s data, but conceptually the old and new trees don’t share any nodes.
>>
>> IOW, the editing feature for nodes is for very specific situations,
>> where you have edited the parse tree but haven’t reparsed yet. In that
>> case, if you want your node’s positions to be correct, you edit the node.
>> But once you reparse, there is no way to "update" this old node
>> into its "equivalent" in the new tree.
>>
>> I’m not sure whether tree-sitter is capable of doing what you want (after
>> all, the old and new trees share data). But currently it doesn’t
>> expose a way to do that.
>>
>
> That's a shame. The documentation is a little bit ambiguous then. But
> if the library returns a brand-new tree and thus nodes, then I can see
> why this won't work.
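
The four steps quoted above can be modeled in a few lines. What follows is
only a toy sketch in Python, not tree-sitter's API: `Node`, `Tree`,
`Tree.edit`, and `parse` are hypothetical stand-ins, and the "grammar" simply
treats each whitespace-separated word as a node. It is meant to show that
editing fixes up positions in place, while reparsing hands back new node
objects.

```python
from dataclasses import dataclass


@dataclass(eq=False)  # compare by identity, like distinct node instances
class Node:
    kind: str
    start: int
    end: int


class Tree:
    def __init__(self, nodes):
        self.nodes = nodes

    def edit(self, pos, inserted_len):
        """Step 3: shift positions in place; nothing is reparsed here."""
        for node in self.nodes:
            if node.start >= pos:
                node.start += inserted_len
            if node.end > pos:
                node.end += inserted_len


def parse(text, old_tree=None):
    """Step 4: always return a new Tree holding new Node objects.

    old_tree is accepted only to mirror parse(TREE, TEXT'); this toy
    ignores it, whereas the real library reuses the old tree's data.
    """
    nodes, pos = [], 0
    for word in text.split():
        start = text.index(word, pos)
        pos = start + len(word)
        nodes.append(Node("word", start, pos))
    return Tree(nodes)


text = "foo bar"
tree = parse(text)                      # step 1
old_node = tree.nodes[1]                # the node for "bar"
new_text = text[:1] + "abc" + text[1:]  # step 2: insert "abc" at position 1
tree.edit(pos=1, inserted_len=3)        # step 3: old_node's range is shifted
new_tree = parse(new_text, tree)        # step 4: a brand-new tree
new_node = new_tree.nodes[1]
```

After step 3, `old_node` covers the same range as `new_node`, but the two are
still different objects, which is the situation described in the quoted text.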

Yeah, I wish tree-sitter had it. Maybe you can raise an issue on
tree-sitter’s GitHub. The author seems to be rather busy, though.

> One possible workaround is that outdated nodes are proxies for their
> underlying data (node type, range, text, anonymous/named) so that
> their actual state is kept around. That will allow `equal' checks to
> still succeed on an outdated and a "brand-new, but identical" node.
>
> Food for thought.
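
The proxy idea quoted above could look roughly like the following. This is a
minimal sketch in Python under stated assumptions, not something Emacs or
tree-sitter provides: `NodeSnapshot`, `snapshot`, and `LiveNode` are
hypothetical names, and `LiveNode` is just a stand-in for whatever node object
the parser returns.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeSnapshot:
    """Plain-data copy of a node; stays usable after the node is outdated."""
    kind: str
    start: int
    end: int
    text: str
    named: bool


class LiveNode:
    """Hypothetical stand-in for a parser-owned node instance."""
    def __init__(self, kind, start, end, named=True):
        self.kind, self.start, self.end, self.named = kind, start, end, named


def snapshot(node, source):
    """Copy type, range, text, and named-ness out of a live node."""
    return NodeSnapshot(node.kind, node.start, node.end,
                        source[node.start:node.end], node.named)


source = "foo bar"
# Two distinct instances describing the same span, as you would get from
# the tree before and after a reparse that left this span unchanged.
before = snapshot(LiveNode("word", 4, 7), source)
after = snapshot(LiveNode("word", 4, 7), source)
same = before == after  # field-by-field comparison, not identity
```

Because the dataclass is frozen, snapshots are also hashable, so they can serve
as keys in a node registry. The comparison only says the two nodes look the
same; it does not say the parser considers them the same node.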

If you can describe what high-level feature you want to accomplish (with
node update), maybe I can provide some suggestions.

Yuan




