Status update of tree-sitter features
From: Yuan Fu
Subject: Status update of tree-sitter features
Date: Wed, 28 Dec 2022 01:44:32 -0800
Hi,
With the complete feature freeze approaching, this is probably the last set of
features added to Emacs 29. I stuffed them in just in time ;-)
1. There is a new predicate in the query language, #pred. Like #equal and
#match, it filters the captured nodes, but it does so with an arbitrary
function. Right now some queries in the font-lock settings match a little more
than what we actually want. For example, for the property feature, we only want
the “bb” in “aa.bb”, but not the one in “aa.bb(cc)”, because the latter is a
method, not a property. The query usually matches both. With this new predicate
we can use a function to filter out the methods.
If we can ensure that every query only captures the intended nodes, the
font-lock queries can be reused for context extraction: using the query for the
variable feature, I can find all the variables in a given region, etc.
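A minimal sketch of the idea in item 1, not the code of any built-in mode. It assumes the tree-sitter JavaScript grammar, where “aa.bb” is a member_expression with a property field and “aa.bb(cc)” is a call_expression whose function field holds that member_expression. The names my-js--property-p, my-js--property-query and my-js-properties-in-region are hypothetical, and (:pred FUNCTION @capture) is assumed to be the s-expression spelling of #pred in Emacs 29.

```elisp
(require 'treesit)

(defun my-js--property-p (node)
  "Return non-nil if NODE is a property that is not being called.
NODE is the captured property_identifier.  Hypothetical helper."
  (let* ((member (treesit-node-parent node))
         (outer (and member (treesit-node-parent member))))
    (not (and outer
              (equal (treesit-node-type outer) "call_expression")
              ;; In aa.bb(cc) the member expression is the callee.
              (treesit-node-eq
               member
               (treesit-node-child-by-field-name outer "function"))))))

(defvar my-js--property-query
  '(((member_expression
      property: (property_identifier) @property)
     ;; Keep the match only if the function returns non-nil.
     (:pred my-js--property-p @property)))
  "Query that captures property accesses but not method calls.")

(defun my-js-properties-in-region (beg end)
  "Return the property nodes between BEG and END.
Assumes the current buffer has a `javascript' parser."
  (treesit-query-capture (treesit-buffer-root-node 'javascript)
                         my-js--property-query
                         beg end t))
```

Because the predicate narrows the query to exactly the intended nodes, the same pattern could in principle serve both a font-lock rule (with a face as the capture name) and a context-extraction helper like the one above.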
2. We’ve had treesit-defun-type-regexp for a while, and I recently generalized
the idea into “things”. Now you can use treesit--things-around,
treesit--navigate-thing, and treesit--thing-at-point to find and navigate
arbitrary “things”. A “thing” is defined by a regexp that matches the node
types, plus (optionally) a filter function.
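To illustrate the “regexp plus optional filter” shape without calling the internal functions, here is a sketch that uses the public defun variable, which in Emacs 29 is documented to accept either a regexp or a cons (REGEXP . FILTER). The node types come from the tree-sitter Python grammar, and the my-lang-- names and the particular filter (skip definitions nested inside another function) are assumptions made for the example.

```elisp
(require 'treesit)

(defun my-lang--defun-valid-p (node)
  "Return non-nil if NODE should count as a defun.
Hypothetical filter: reject definitions nested inside a function."
  (not (treesit-parent-until
        node
        (lambda (parent)
          (equal (treesit-node-type parent) "function_definition")))))

(defun my-lang--setup-defun ()
  "Define what a defun is: a node-type regexp plus a filter function."
  (setq-local treesit-defun-type-regexp
              (cons (rx bos
                        (or "function_definition" "class_definition")
                        eos)
                    #'my-lang--defun-valid-p)))
```

With this in place, treesit-defun-at-point and the defun navigation commands should only consider nodes that match the regexp and pass the filter.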
3. Now there is imenu support. Major modes don’t need to define their own imenu
functions anymore; they just need to set treesit-simple-imenu-settings. They
also need to set treesit-defun-name-function, which is a function that finds
out the name of a defun node. It is used by both imenu and add-log-entry.
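A sketch of what a major mode might set for item 3, assuming a Python-like grammar in which function_definition and class_definition nodes carry a name field. The my-lang-- names are hypothetical. Each entry of treesit-simple-imenu-settings has the form (CATEGORY REGEXP PRED NAME-FN); leaving NAME-FN nil is assumed to fall back to treesit-defun-name-function.

```elisp
(require 'treesit)

(defun my-lang--defun-name (node)
  "Return the name of the defun NODE, or nil if it has none."
  (pcase (treesit-node-type node)
    ((or "function_definition" "class_definition")
     (let ((name (treesit-node-child-by-field-name node "name")))
       (and name (treesit-node-text name t))))))

(defun my-lang--setup-imenu ()
  "Configure imenu for a hypothetical tree-sitter major mode."
  (setq-local treesit-defun-name-function #'my-lang--defun-name)
  (setq-local treesit-simple-imenu-settings
              '(("Function" "\\`function_definition\\'" nil nil)
                ("Class" "\\`class_definition\\'" nil nil))))
```

In a real mode these variables would be set in the mode body before the call to treesit-major-mode-setup, which installs the imenu index function when the settings are non-nil.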
4. C-like modes now have adequate indent and filling for block comments.
Lastly, I want to remind everyone to update the font-lock settings for your
major mode to be more compliant with the standard list of features we decided on.
This is not a hard requirement and major modes are free to extend upon it, but
it’s nice to be consistent, especially among built-in modes.
Here is the list, for your reference. Among all the features, I think
assignment is “nice to have”; it’s fine to leave it out if there isn’t enough
time. The same goes for key: it may or may not apply to a language.
Basic tokens:
delimiter             ,.;                 (delimit things)
operator              == != ||            (produces a value)
bracket               []{}()
misc-punctuation
constant              true, false, null
number
keyword
comment               (includes doc-comments)
string                (includes chars and docstrings)
string-interpolation  f"text {variable}"
escape-sequence       "\n\t\\"
function              every function identifier
variable              every variable identifier
type                  every type identifier
property              a.b                 <--- highlight b
key                   { a: b, c: d }      <--- highlight a, c
error                 highlight parse error
Abstract features:
assignment: the LHS of an assignment (the thing being assigned to), e.g.:
    a = b          <--- highlight a
    a.b = c        <--- highlight b
    a[1] = d       <--- highlight a
definition: the thing being defined, e.g.:
    int a(int b) { <--- highlight a
      return 0;
    }
    int a;         <--- highlight a
    struct a {     <--- highlight a
      int b;       <--- highlight b
    }
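For concreteness, the two abstract features could be expressed roughly as follows for C. This is a sketch rather than the rules that c-ts-mode actually ships. It assumes the node and field names of the tree-sitter C grammar, the function name my-c--abstract-feature-rules is hypothetical, and it only uses long-standing font-lock faces. Declarations with an initializer (int a = 1;) are not covered.

```elisp
(require 'treesit)

(defun my-c--abstract-feature-rules ()
  "Return font-lock rules for the definition and assignment features.
Sketch only; meant for a mode whose buffers have a `c' parser."
  (treesit-font-lock-rules
   :language 'c
   :feature 'definition
   '(;; int a(int b) { ... }  highlights a
     (function_definition
      declarator: (function_declarator
                   declarator: (identifier) @font-lock-function-name-face))
     ;; int a;  highlights a
     (declaration
      declarator: (identifier) @font-lock-variable-name-face)
     ;; struct a { int b; }  highlights a and b
     (struct_specifier
      name: (type_identifier) @font-lock-type-face)
     (field_declaration
      declarator: (field_identifier) @font-lock-variable-name-face))

   :language 'c
   :feature 'assignment
   '(;; a = b  highlights a
     (assignment_expression
      left: (identifier) @font-lock-variable-name-face)
     ;; a.b = c  highlights b
     (assignment_expression
      left: (field_expression
             field: (field_identifier) @font-lock-variable-name-face))
     ;; a[1] = d  highlights a
     (assignment_expression
      left: (subscript_expression
             argument: (identifier) @font-lock-variable-name-face)))))
```

A mode would assign the returned value to treesit-font-lock-settings, and list definition and assignment in its feature list so that they can be enabled.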
As for decoration levels, this is my suggestion:
'((comment definition)
  (keyword string type)
  (assignment builtin constant decorator
   escape-sequence key number property string-interpolation)
  (bracket delimiter function misc-punctuation operator variable))
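As a sketch of where such a list would go: a major mode sets it as the buffer-local value of treesit-font-lock-feature-list, one sublist per decoration level. The function name my-lang--setup-font-lock is hypothetical.

```elisp
(require 'treesit)

(defun my-lang--setup-font-lock ()
  "Install the suggested decoration levels for a hypothetical mode."
  (setq-local treesit-font-lock-feature-list
              '((comment definition)                      ; level 1
                (keyword string type)                     ; level 2
                (assignment builtin constant decorator    ; level 3
                 escape-sequence key number property
                 string-interpolation)
                (bracket delimiter function               ; level 4
                 misc-punctuation operator variable))))
```

Users then choose how many of these levels are active through treesit-font-lock-level, whose default of 3 would enable everything except the last sublist.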
Yuan