Re: Standardizing tree-sitter fontification features

From: Randy Taylor
Subject: Re: Standardizing tree-sitter fontification features
Date: Fri, 25 Nov 2022 01:13:46 +0000

On Thursday, November 24th, 2022 at 17:16, Yuan Fu <casouri@gmail.com> wrote:

> For tree-sitter-based major modes, fontification rules are categorized into 
> “features”, which can be individually turned on/off. I think it would be good 
> to have a standardized list of common features and their precise meaning 
> defined. We’ve been working on these fontification rules for some time and 
> arrived at a reasonable baseline, and now it’s a good time to discuss and 
> bless it, I think.
> Right now we have:
> Basic tokens:
> delimiter ,.;
> operator = != ||
> bracket []{}()
> constant true, false, null
> number
> keyword
> comment
> string
> string-interpolation f"text {variable}"
> escape-sequence "\n\t\\"
> function every function identifier
> variable every variable identifier
> type every type identifier
> property a.b <--- highlight b
> key { a: b, c: d } <--- highlight a, c
> error highlight parse error
> More abstract ones:
> assignment: the LHS of an assignment (thing being assigned to), eg:
> a = b <--- highlight a
> a.b = c <--- highlight b
> a[1] = d <--- highlight a
> definition: the thing being defined, eg:
> int a(int b) { <--- highlight a
> return 0
> }
> int a; <-- highlight a
> struct a { <--- highlight a
> int b; <--- highlight b
> }
> There are also language-specific features, but they are not the focus here.
> Once we agree on a list of standard features and their definitions, the next 
> step would be to figure out how a major mode should introduce its supported 
> features to a user (major mode docstring + link to manual for standard 
> features?).
> Also, some of the features are very busy; it would be good if we could disable 
> them by default. The default value of font-lock-maximum-decoration is t, 
> meaning use everything, which is not very helpful...
> Yuan

Looks good!

key should be considered property IMO, and that's how we're highlighting things 

I wonder if assignment and definition are really worth having (and would prefer 
to do without them), since they should be covered by the variable, function, 
type and property features.

I would also add:
- misc-punctuation, for anything not considered a delimiter or bracket. Most 
modes would use this for any special punctuation they've got.
- (maybe) literal instead of number? That way there is a group for chars too 
(and any other literals if there are any?). Or a char feature in addition to 
the existing number one. I'm undecided...

Maybe a slight tangent, but I also suggest we alphabetize all of these, both the 
queries and the list of features. I'll send a patch to do that myself once 
things cool down a bit. One caveat: any rule that overrides earlier ones will 
still need to go at the bottom to make sure it gets applied.
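
To make the ordering concrete (alphabetical, with overriding rules last), here
is a rough sketch. It is not the actual patch: the helper name and the
(FEATURE . OVERRIDE-P) cons cells are made up for illustration and only stand
in for real font-lock rules.

```elisp
(require 'seq)

;; Hypothetical helper, not part of Emacs.  Each entry is a cons cell
;; (FEATURE . OVERRIDE-P), a stand-in for one font-lock rule.
(defun my-sort-feature-entries (entries)
  "Return ENTRIES sorted by feature name, with overriding entries last."
  (let ((by-name (lambda (a b) (string< (car a) (car b)))))
    (append
     ;; Non-overriding entries first, in alphabetical order.
     (sort (seq-remove #'cdr entries) by-name)
     ;; Overriding entries go at the bottom, also alphabetized.
     (sort (seq-filter #'cdr entries) by-name))))

(my-sort-feature-entries
 '((string . nil) (escape-sequence . t) (comment . nil) (keyword . nil)))
;; => ((comment) (keyword) (string) (escape-sequence . t))
```

Here escape-sequence is assumed to be an overriding rule (it is applied on top
of string), so it ends up last even though it would otherwise sort second.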

