emacs-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Tree-sitter indentation for js-mode & cc-mode


From: Theodor Thornhill
Subject: Re: Tree-sitter indentation for js-mode & cc-mode
Date: Thu, 27 Oct 2022 11:11:01 +0200

Hi Yuan!

> I did some work to allow tree-sitter indentation engine to plug in to
> c-offset-alist. Currently in a tree-sitter indent rule, we have
>
> (MATCHER ANCHOR OFFSET)
>
> OFFSET is normally an integer, but now it can also be a syntax symbol
> recognized by cc-mode’s indentation engine. In that case, tree-sitter
> indent calculates the indent using c-calc-offset, passing the syntax
> symbol and anchor position to it, and c-calc-offset will give us the
> integer offset based on c-offset-alist.
>

This is cool, but do we really want/need this?  I mean, now we're really
binding these implementations together and allowing all the legacy of CC
mode to blend in.  We also need knowledge of how CC mode names their
syntactic definitions.  IMO one of the big selling points of tree sitter
is that you can look at other editors implementation and get inspired
immediately.  Now we need deep knowledge of cc mode, don't we?  Also,
why would we want cc mode to calculate this for us?  I see what you're
trying to do, but _I_ think this is a step in the wrong direction.


> I’ve written indent rules for js-mode, they are in
> js-treesit-cc-indent-rules. Overall it works pretty well. Theo, could
> you give it a try? From my testing it is already an improvement from
> the original rules. I didn’t finish the JSX part and just copied your
> original rules for JSX. In the future I can probably port that to
> cc-style too.
>

I don't think this is better, for example:

The old variant renders this snippet correctly:
```
const fooClient = new Foo({
  bucket: process.env.foo,
  region: process.env.foo,
});
```

But the new one renders it like this:
```
const fooClient = new Foo({
                            bucket: process.env.foo,
                          region: process.env.foo,
                          });
```

I know this is a matter of tweaking, but it immediately makes me
question the reasoning to blend them.

In addition I profiled indenting a 50k lines js file with messed up
indentation, and received some surprising results.  The pure cc mode
variant is slow, but MUCH faster than tree-sitter.  It seems we are
looking up way to much the root of the tree, but you know the internals
here better than me.  Is this something we can optimize away? See the
attached report at the bottom.


> I also added imenu support for js-mode and ts-mode, and navigation for
> python-mode.
>

Very cool - it seems to work pretty nicely!

Anyways - can we please revisit the idea that we init and/or use cc mode
in tandem with tree-sitter?  I know we want "feature parity", but I
think we lose too much of the simplicity gained by adding in the old
complexity.  My prediction for the future is that this will result in
numerous bug reports where it's hard to know whether this is a fix for
cc mode or treesitter.  And in the end people _will_ just skip these
modes altogether and put simpler ones in (m)elpa that only uses treesit,
to avoid this.  The cc mode won't go away at all, for the people that
considers that superior.  We can still use the treesit-settings as a
centralized variable to get most, if not all of the "auto-enabling"
benefits by just lifting it up:

```
  ;;....

  (cond
   ;; Tree-sitter.
   ((treesit-ready-p 'js-mode 'javascript)
    ;; init all treesitter relevant stuff - can add in _some_ other
    ;; non-cc-mode settinigs, such as comment-start, etc above this.
    ;; We don't need the cache, detection of js-jsx or any of the
    ;; before-change-functions
  
    (treesit-major-mode-setup))
   ;; Elisp.
   (t
     ;; enable in normal cc mode stuff
    )))

```

This way other hypothetical tree-sitter-v2 in the future is just a
simple cond, and no need to worry.

If I'm missing something important here, please let me know, but I
_really_ don't understand the reason for merging these implementations.

Anyway, thanks for your continued hard work!

Theo

Attachment: indent-report.txt
Description: Text document


reply via email to

[Prev in Thread] Current Thread [Next in Thread]