bug-gnu-emacs

bug#61369: Problem with keeping tree-sitter parse tree up-to-date


From: Dmitry Gutov
Subject: bug#61369: Problem with keeping tree-sitter parse tree up-to-date
Date: Wed, 15 Feb 2023 04:17:29 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.4.2

On 14/02/2023 01:59, Yuan Fu wrote:
> There are two surprises here: 1) there isn’t an off-by-one bug, 2) the
> parser actually read the whole buffer, rather than reading only the new
> content. Then there are even less reason for it to create that error
> node.

The parser reads the whole buffer, but if it tries to reparse based on the previous parse tree with incorrect positions, it might get into an invalid state as a result.

I've tried gdb-ing treesit_tree_edit_1 (after dropping the 'inline' qualifier), and here's what I see:

- If I paste the test line without the trailing newline, the value of new_end_byte is 67.

- If I paste the test line with the trailing newline, the value of new_end_byte is still 67. But then it is followed by this call right away:

Thread 1 "emacs" hit Breakpoint 3, treesit_tree_edit_1 (tree=tree@entry=0x5555574139b0, start_byte=start_byte@entry=134, old_end_byte=old_end_byte@entry=134, new_end_byte=135) at treesit.c:739

- If I 'undo' after that, the call is as expected:

Thread 1 "emacs" hit Breakpoint 3, treesit_tree_edit_1 (tree=0x555557435cd0, start_byte=start_byte@entry=0, old_end_byte=old_end_byte@entry=68, new_end_byte=new_end_byte@entry=0) at treesit.c:739
739     {

So I tried again to figure out the odd call, with the backtrace:

Thread 1 "emacs" hit Breakpoint 3, treesit_tree_edit_1 (tree=tree@entry=0x5555575b64f0, start_byte=start_byte@entry=134, old_end_byte=old_end_byte@entry=134, new_end_byte=269) at treesit.c:739
739     {
(gdb) backtrace
#0  treesit_tree_edit_1 (tree=tree@entry=0x5555575b64f0, start_byte=start_byte@entry=134, old_end_byte=old_end_byte@entry=134, new_end_byte=269) at treesit.c:739
#1  0x00005555557cb085 in treesit_sync_visible_region (parser=parser@entry=XIL(0x555556fc329d)) at treesit.c:931
#2  0x00005555557ccf28 in treesit_ensure_parsed (parser=XIL(0x555556fc329d)) at treesit.c:1025
#3  Ftreesit_parser_root_node (parser=XIL(0x555556fc329d)) at treesit.c:1507

treesit.c:931 (frame #1, in treesit_sync_visible_region) points to a treesit_tree_edit_1 call which is predicated on this condition:

  if (visible_end < BUF_ZV_BYTE (buffer))

...which shouldn't be the case since the buffer is small enough to fit in the default window. It might already be the consequence of passing the wrong value of new_end_byte to ts_tree_edit, though.

Going back to the first call, the backtrace looks like this:

Thread 1 "emacs" hit Breakpoint 3, treesit_tree_edit_1 (tree=0x5555574f0ff0, start_byte=start_byte@entry=0, old_end_byte=old_end_byte@entry=0, new_end_byte=new_end_byte@entry=67) at treesit.c:739
739     {
(gdb) backtrace
#0  treesit_tree_edit_1 (tree=0x5555574f0ff0, start_byte=start_byte@entry=0, old_end_byte=old_end_byte@entry=0, new_end_byte=new_end_byte@entry=67) at treesit.c:739
#1  0x00005555557cc991 in treesit_record_change (start_byte=1, old_end_byte=1, new_end_byte=69) at treesit.c:806
#2  0x00005555556f8bb7 in insert_from_string_1 (string=XIL(0x55555744c4f4), pos=0, pos_byte=0, nchars=68, nbytes=68, inherit=<optimized out>, before_markers=false) at insdel.c:1084

Seems like treesit_record_change turns new_end_byte=69 into new_end_byte=67 when it calls treesit_tree_edit_1. Since start_byte=1 maps to offset 0, I would have expected an offset of 68 here.

It seems to fail in this calculation:

  ptrdiff_t new_end_offset = (min (visible_end,
                                   max (visible_beg, new_end_byte))
                              - visible_beg);

because visible_end is still 68 there. Its value gets updated later, closer to the end of this function.
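
To make the arithmetic concrete, here is a minimal standalone sketch of that clipping step. It is not the code in treesit.c: clipped_offset is a hypothetical helper, and visible_beg = 1 is an assumption inferred from the backtrace, where start_byte=1 becomes offset 0.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of treesit.c: clip BYTE_POS into
   [visible_beg, visible_end] and return its 0-based offset from
   visible_beg.  This mirrors the min/max expression quoted above.  */
ptrdiff_t
clipped_offset (ptrdiff_t byte_pos, ptrdiff_t visible_beg,
                ptrdiff_t visible_end)
{
  assert (visible_beg <= visible_end);

  ptrdiff_t clipped = byte_pos;
  if (clipped < visible_beg)
    clipped = visible_beg;      /* the max (visible_beg, ...) part */
  if (clipped > visible_end)
    clipped = visible_end;      /* the min (visible_end, ...) part */

  return clipped - visible_beg;
}
```

With the stale bound, clipped_offset (69, 1, 68) returns 67, which matches the new_end_byte seen in treesit_tree_edit_1. If visible_end had already been 69, the same call would return 68.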
