From: Mickey Petersen
Subject: bug#60656: 30.0.50; tree-sitter: editing a buffer invalidates visited node instances
Date: Mon, 09 Jan 2023 08:56:57 +0000
User-agent: mu4e @VERSION@; emacs 30.0.50

Yuan Fu <casouri@gmail.com> writes:
> Mickey Petersen <mickey@masteringemacs.org> writes:
>
>> If you parse some text, retrieve a node -- using `treesit-node-at',
>> for example -- and then edit the buffer, then the node you retrieved
>> is marked outdated.
>>
>> However, tree-sitter is capable of handling that, to a greater or lesser
>> extent:
>>
>> https://tree-sitter.github.io/tree-sitter/using-parsers#editing
>>
>> It is therefore possible to refresh node instances that were created
>> _before_ the edit. I suppose it could remain an explicit step that you
>> must enter a special form and then Emacs will track node instances
>> issued inside that form and refresh them when edits take place inside
>> of it.
>>
>> As it stands, it is very hard to edit and maintain a node registry at
>> the same time. (I'm using markers and overlays as a crude hack to work
>> around it.)
>
> This is kind of a limitation of tree-sitter. The "node editing" isn’t
> like what you thought (it fooled me too when I first read it).
> Tree-sitter’s incremental parsing works roughly like this:
>
> 1. You have a parsed tree, TREE, corresponding to some TEXT
> 2. You make some edit to the TEXT, eg, TEXT’ = insert(TEXT, 1, "abc")
> 3. Now you need to "edit" the old tree with _positions_ of your edit:
> edit(TREE, Insert(pos=1, len=3)) (Notice that this modifies the tree
> in-place.)
> 4. You reparse the edited tree and get a new tree:
> TREE’ = parse(TREE, TEXT’) (Notice that this returns a new tree.)
>
> If you have a NODE from TREE, editing that node only updates position
> information. That corresponds to the edit(TREE, ...) step. There is no
> equivalent of the parse(TREE, TEXT’) step for nodes: once the tree is
> reparsed and a new tree is returned, none of the nodes in the old tree
> gets carried to the new tree. In practice, tree-sitter reuses the old tree’s
> data, but conceptually the old and new trees don’t share any nodes.
>
> IOW, the editing feature for nodes is for very specific situations,
> where you have edited the parse tree but haven’t reparsed yet. In this case, if
> you want your node’s positions to be correct, you edit the node.
> But once you reparse, there is no way to somehow "update" this old node
> into its "equivalent" in the new tree.
>
> I’m not sure whether tree-sitter is capable of doing what you want (after
> all, the old and new trees share data). But currently it doesn’t
> expose a way to do that.
>
That's a shame. The documentation is a little bit ambiguous then. But if the
library returns a brand-new tree, and thus brand-new nodes, then I can see why
this won't work.
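
To make the four quoted steps concrete, here is a toy model in Python. It is
only a sketch under loud assumptions: `Node`, `Tree`, `parse`, `edit_insert`
and `reparse` are hypothetical names invented for illustration, positions are
0-based, the "grammar" is just whitespace-separated words, and none of this is
tree-sitter's or Emacs's actual API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(eq=False)  # eq=False: nodes compare by identity, not by content
class Node:
    kind: str
    start: int
    end: int


@dataclass(eq=False)
class Tree:
    nodes: List[Node] = field(default_factory=list)


def parse(text: str) -> Tree:
    """Step 1: toy parser; each whitespace-separated word becomes a node."""
    nodes, pos = [], 0
    for word in text.split():
        start = text.index(word, pos)
        pos = start + len(word)
        nodes.append(Node("word", start, pos))
    return Tree(nodes)


def edit_insert(tree: Tree, pos: int, length: int) -> None:
    """Step 3: record an insertion by shifting positions IN PLACE.

    Nothing is reparsed here; existing Node objects just get new offsets.
    """
    for node in tree.nodes:
        if node.start >= pos:
            node.start += length
        if node.end > pos:
            node.end += length


def reparse(old_tree: Tree, new_text: str) -> Tree:
    """Step 4: return a NEW tree built from the new text.

    This toy ignores old_tree entirely. Per the explanation above, the real
    library reuses the old tree's data internally, but the caller still gets
    new node objects either way.
    """
    return parse(new_text)


text = "foo bar"
tree = parse(text)
old_bar = tree.nodes[1]                 # a node we are holding on to

new_text = "abc " + text                # step 2: edit the text
edit_insert(tree, pos=0, length=4)      # step 3: old_bar's offsets shift
new_tree = reparse(tree, new_text)      # step 4: brand-new tree and nodes

same_span = [n for n in new_tree.nodes
             if (n.start, n.end) == (old_bar.start, old_bar.end)]
print(len(same_span), same_span[0] is old_bar)
```

In this model the held node ends up with the right offsets after the edit
step, and the new tree contains a node covering the same span, but it is a
different object: nothing links the old instance to its counterpart.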

One possible workaround is for outdated nodes to act as proxies for their
underlying data (node type, range, text, anonymous/named) so that
their actual state is kept around. That would allow `equal' checks to
still succeed on an outdated and a "brand-new, but identical" node.
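
A rough sketch of that proxy idea, written in Python purely for illustration:
`NodeSnapshot` and `snapshot` are hypothetical names, not an existing Emacs or
tree-sitter API. In Emacs the same fields would presumably be read from a live
node (type, start, end, text, named-ness) before it goes stale.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen + default eq: compares and hashes by value
class NodeSnapshot:
    """Plain record of what a node looked like when it was captured."""
    kind: str     # node type, e.g. "identifier"
    start: int    # range start
    end: int      # range end
    text: str     # text the node covered
    named: bool   # named vs. anonymous node


def snapshot(kind: str, start: int, end: int, source: str,
             named: bool = True) -> NodeSnapshot:
    """Capture a node's observable state from the source text it spans."""
    return NodeSnapshot(kind, start, end, source[start:end], named)


source = "foo bar"

# Captured from a node before some unrelated edit and reparse...
before = snapshot("identifier", 4, 7, source)
# ...and from the "brand-new, but identical" node afterwards.
after = snapshot("identifier", 4, 7, source)

print(before == after)   # value equality holds across reparses
print(before is after)   # even though they are distinct objects

# Snapshots are hashable, so they can key a node registry.
registry = {before: "my annotation"}
print(registry[after])
```

The design choice here is that equality is defined over the recorded data
rather than over the node instance, so a registry keyed on snapshots survives
a reparse as long as the type, range, text and named-ness are unchanged.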
Food for thought.
> Yuan