[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
bug#60054: 29.0.60; Infinite loop when there are cyclic path in the pars
From: |
Yuan Fu |
Subject: |
bug#60054: 29.0.60; Infinite loop when there are cyclic path in the parse tree |
Date: |
Wed, 14 Dec 2022 12:27:58 -0800 |
> On Dec 14, 2022, at 4:08 AM, Eli Zaretskii <eliz@gnu.org> wrote:
>
>> From: Yuan Fu <casouri@gmail.com>
>> Date: Tue, 13 Dec 2022 16:11:01 -0800
>>
>>
>> This is not really an Emacs bug, but either tree-siter-c or
>> tree-sitter’s. I’m putting it out here so that if I’m hit by a bus
>> tomorrow, and treesit-search-forward-goto and friends hang,
>> we (eh, you) know what’s going on.
>>
>> I’ve submitted an issue here:
>> https://github.com/tree-sitter/tree-sitter-c/issues/119
>>
>> So far, I’ve only observed this in that specific edge case.
>
> We should have protection against that, which should be easy, right?
Just to make sure, we want to use something like slow-fast pointers, where we
have two pointers, and one goes twice as fast, right? That’s the one I was
taught in school :-)
The author advices to use cursors for traversing the tree, as cursors doesn’t
have this bug. He also advised against using ts_node_parent and said that they
could be removed in the future. I didn’t use cursors in the first place because
they can’t traverse the tree backwards, ie, no equivalent of
ts_node_prev_sibling, and the performance difference is not significant in
Emacs settings. But I just went to look at the source, and it seems
ts_node_prev_siblings(node) is implemented by just iterating each children from
first to last, until it finds the child just before NODE, LOL[1]. I can do
similar things in treesit.c with cursors. By doing that we can fix this problem
and be future-proof.
In summary, I’m proposing:
1. I add the slow-fast pointer checks in treesit.c and treesit.el
2. I replace ts_node_parent/sibling/child with using cursors in tree-traversal
functions in treesit.c.
Yuan
[1]
static inline TSNode ts_node__prev_sibling(TSNode self, bool include_anonymous)
{
Subtree self_subtree = ts_node__subtree(self);
bool self_is_empty = ts_subtree_total_bytes(self_subtree) == 0;
uint32_t target_end_byte = ts_node_end_byte(self);
TSNode node = ts_node_parent(self);
TSNode earlier_node = ts_node__null();
bool earlier_node_is_relevant = false;
while (!ts_node_is_null(node)) {
TSNode earlier_child = ts_node__null();
bool earlier_child_is_relevant = false;
bool found_child_containing_target = false;
TSNode child;
NodeChildIterator iterator = ts_node_iterate_children(&node);
while (ts_node_child_iterator_next(&iterator, &child)) {
if (child.id == self.id) break;
if (iterator.position.bytes > target_end_byte) {
found_child_containing_target = true;
break;
}
if (iterator.position.bytes == target_end_byte &&
(!self_is_empty ||
ts_subtree_has_trailing_empty_descendant(ts_node__subtree(child),
self_subtree))) {
found_child_containing_target = true;
break;
}
if (ts_node__is_relevant(child, include_anonymous)) {
earlier_child = child;
earlier_child_is_relevant = true;
} else if (ts_node__relevant_child_count(child, include_anonymous) > 0) {
earlier_child = child;
earlier_child_is_relevant = false;
}
}
if (found_child_containing_target) {
if (!ts_node_is_null(earlier_child)) {
earlier_node = earlier_child;
earlier_node_is_relevant = earlier_child_is_relevant;
}
node = child;
} else if (earlier_child_is_relevant) {
return earlier_child;
} else if (!ts_node_is_null(earlier_child)) {
node = earlier_child;
} else if (earlier_node_is_relevant) {
return earlier_node;
} else {
node = earlier_node;
}
}
return ts_node__null();
}