emacs-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Tree-sitter indentation for js-mode & cc-mode


From: Yuan Fu
Subject: Re: Tree-sitter indentation for js-mode & cc-mode
Date: Thu, 27 Oct 2022 08:21:43 -0700


> On Oct 27, 2022, at 2:11 AM, Theodor Thornhill <theo@thornhill.no> wrote:
> 
> 
> Hi Yuan!
> 
>> I did some work to allow tree-sitter indentation engine to plug in to
>> c-offset-alist. Currently in a tree-sitter indent rule, we have
>> 
>> (MATCHER ANCHOR OFFSET)
>> 
>> OFFSET is normally an integer, but now it can also be a syntax symbol
>> recognized by cc-mode’s indentation engine. In that case, tree-sitter
>> indent calculates the indent using c-calc-offset, passing the syntax
>> symbol and anchor position to it, and c-calc-offset will give us the
>> integer offset based on c-offset-alist.
>> 
> 
> This is cool, but do we really want/need this?  I mean, now we're really
> binding these implementations together and allowing all the legacy of CC
> mode to blend in.  We also need knowledge of how CC mode names their
> syntactic definitions.  IMO one of the big selling points of tree sitter
> is that you can look at other editors implementation and get inspired
> immediately.  Now we need deep knowledge of cc mode, don't we?  Also,
> why would we want cc mode to calculate this for us?  I see what you're
> trying to do, but _I_ think this is a step in the wrong direction.

You have a point. I tried to blend in cc-mode because that’ll allow us support 
“styles” and existing user customization. (Also I started out thinking it will 
be easier to write indentation rules this way which turns out to be not true.) 
Perhaps it’s better to come up with a new system for customizing indentation 
style. I’ll revert this change.

> 
> 
>> I’ve written indent rules for js-mode, they are in
>> js-treesit-cc-indent-rules. Overall it works pretty well. Theo, could
>> you give it a try? From my testing it is already an improvement from
>> the original rules. I didn’t finish the JSX part and just copied your
>> original rules for JSX. In the future I can probably port that to
>> cc-style too.
>> 
> 
> I don't think this is better, for example:
> 
> The old variant renders this snippet correctly:
> ```
> const fooClient = new Foo({
>  bucket: process.env.foo,
>  region: process.env.foo,
> });
> ```
> 
> But the new one renders it like this:
> ```
> const fooClient = new Foo({
>                            bucket: process.env.foo,
>                          region: process.env.foo,
>                          });
> ```
> 
> I know this is a matter of tweaking, but it immediately makes me
> question the reasoning to blend them.

It’s largely my slip-up rather than inherit defect of the system, but I agree 
with your opinion above.

> 
> In addition I profiled indenting a 50k lines js file with messed up
> indentation, and received some surprising results.  The pure cc mode
> variant is slow, but MUCH faster than tree-sitter.  It seems we are
> looking up way to much the root of the tree, but you know the internals
> here better than me.  Is this something we can optimize away? See the
> attached report at the bottom.

This is very strange, I need to look into it.

> 
>> I also added imenu support for js-mode and ts-mode, and navigation for
>> python-mode.
>> 
> 
> Very cool - it seems to work pretty nicely!

Nice.

> 
> Anyways - can we please revisit the idea that we init and/or use cc mode
> in tandem with tree-sitter?  I know we want "feature parity", but I
> think we lose too much of the simplicity gained by adding in the old
> complexity.  My prediction for the future is that this will result in
> numerous bug reports where it's hard to know whether this is a fix for
> cc mode or treesitter.  And in the end people _will_ just skip these
> modes altogether and put simpler ones in (m)elpa that only uses treesit,
> to avoid this.  The cc mode won't go away at all, for the people that
> considers that superior.  We can still use the treesit-settings as a
> centralized variable to get most, if not all of the "auto-enabling"
> benefits by just lifting it up:
> 
> ```
>  ;;....
> 
>  (cond
>   ;; Tree-sitter.
>   ((treesit-ready-p 'js-mode 'javascript)
>    ;; init all treesitter relevant stuff - can add in _some_ other
>    ;; non-cc-mode settinigs, such as comment-start, etc above this.
>    ;; We don't need the cache, detection of js-jsx or any of the
>    ;; before-change-functions
> 
>    (treesit-major-mode-setup))
>   ;; Elisp.
>   (t
>     ;; enable in normal cc mode stuff
>    )))
> 
> ```
> 
> This way other hypothetical tree-sitter-v2 in the future is just a
> simple cond, and no need to worry.
> 
> If I'm missing something important here, please let me know, but I
> _really_ don't understand the reason for merging these implementations.

I don’t have an educated opinion on this. If no one has objections I’ll follow 
your professional advice ;-)

> Anyway, thanks for your continued hard work!

Many thanks to you, too!

> 
> Theo
> 
> <indent-report.txt>




reply via email to

[Prev in Thread] Current Thread [Next in Thread]