emacs-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Tree Sitter (was Re: cc-mode fontification feels random)


From: Perry E. Metzger
Subject: Re: Tree Sitter (was Re: cc-mode fontification feels random)
Date: Wed, 21 Jul 2021 10:43:37 -0400
User-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:91.0) Gecko/20100101 Thunderbird/91.0

On 7/20/21 21:28, Stefan Monnier wrote:
Alternately, you can only store the length of each token (as tree-sitter
does); then when processing queries, you have to add up all the lengths
of the preceding tokens in order to report the buffer position of the
information you are computing. That is also linear in the length of the
buffer.
You can probably get better than linear performance in "the usual case"
by storing the total length of the subtree at each node of the AST.
It's still theoretically linear in the worst case, of course.

Thought I would note that there's a substantial literature now on incremental parsing, especially the sort that is needed for editor tools. One doesn't need to reinvent the algorithms, they're out there waiting to be used. The Tree Sitter project is based on previous published work.

There are good links at the end of the https://tree-sitter.github.io/tree-sitter/ web page, but I thought I'd link to some a couple of them directly:

Practical Algorithms for Incremental Software Development Environments
https://www2.eecs.berkeley.edu/Pubs/TechRpts/1997/CSD-97-946.pdf

Incremental Analysis of Real Programming Languages
https://www.semanticscholar.org/paper/Incremental-analysis-of-real-programming-languages-Wagner-Graham/ca69018c29cc415820ed207d7e1d391e2da1656f?p2df

There's also this paper on error recovery for LR parsers, but for some reason it won't load for me right now.
http://www.dtic.mil/dtic/tr/fulltext/u2/a043470.pdf


Perry




reply via email to

[Prev in Thread] Current Thread [Next in Thread]