emacs-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Tree Sitter (was Re: cc-mode fontification feels random)


From: Stephen Leake
Subject: Re: Tree Sitter (was Re: cc-mode fontification feels random)
Date: Tue, 20 Jul 2021 17:04:35 -0700
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (windows-nt)

"Perry E. Metzger" <perry@piermont.com> writes:

> On 7/19/21 19:49, Stephen Leake wrote:
>> "Perry E. Metzger" <perry@piermont.com> writes:
>>
>>> Apologies for not having been present for, er, the extensive previous
>>> discussion on Tree Sitter. I discovered it looking at the archives.
>> Ok.
>>
>>> I still believe that it would be a great thing to integrate into the
>>> base of Emacs. The algorithms it employs are excellent, it's extremely
>>> fast, and it handles the issues of real editors (like dealing with
>>> partial code fragments and incrementally changing the parse on every
>>> keystroke) very efficiently.
>> +1.
>>
>> I'm working on adding incremental parse to wisi (and have been for
>> almost a year now ...). I believe wisi has stronger error recovery than
>> tree-sitter, which allows it to support indentation.
>
>
> Tree sitter can reparse an entire file in a few milliseconds. This is
> almost impossible to achieve in elisp I suspect. 

Yes. wisi.el is an elisp interface to an external process that is
implemented in Ada. At some point, it will also support an internal
module, which will avoid having to send text; larger files will be
faster. When I finish implementing incremental parse, it should be as
fast as tree-sitter.

> Because of this, it can reparse on every keystroke, which is quite an
> astonishing thing.

There are some reports in the tree-sitter issues of reparsing taking
longer, on large files. So there are some parts of the algorithm that
are proportional to the buffer length, while most of the algorithm is
proportional to the changes length. 

Consider; if the parse tree stores absolute buffer position for each
token, then when you insert 5 chars at the beginning of the buffer, the
buffer position of every node in the tree must be shifted by 5 chars.
That process is linear in the length of the buffer (it can also be very
fast).

Alternately, you can only store the length of each token (as tree-sitter
does); then when processing queries, you have to add up all the lengths
of the preceding tokens in order to report the buffer position of the
information you are computing. That is also linear in the length of the
buffer.

We'll have to see how fast wisi is; I'm making good progress in my
testing, but there are still a few non-incremental algorithms to convert.

-- 
-- Stephe



reply via email to

[Prev in Thread] Current Thread [Next in Thread]