Re: Update on tree-sitter structure navigation
From: Yuan Fu
Subject: Re: Update on tree-sitter structure navigation
Date: Sat, 2 Sep 2023 15:12:32 -0700

> On Sep 2, 2023, at 1:50 AM, Hugo Thunnissen <devel@hugot.nl> wrote:
>
> Ihor Radchenko <yantar92@posteo.net> writes:
>
>> Yuan Fu <casouri@gmail.com> writes:
>>
>>> In the months after wrapping up tree-sitter stuff in emacs-29, I was
>>> thinking about how to implement structural navigation and extracting
>>> information from the parser with tree-sitter. In emacs-29 we have
>>> things like treesit-beginning/end-of-defun, and treesit-defun-name. I
>>> was thinking maybe we can generalize this to support getting arbitrary
>>> “thing” at point, moving around them, and getting information like the
>>> name of a defun, its arglist, parent of a class, type of a variable
>>> declaration, etc, in a language-agnostic way.
>>
>> Note that Org mode also does all of these using
>> https://orgmode.org/worg/dev/org-element-api.html
>>
>> It would be nice if we could converge on a more consistent interface
>> across all the modes. For example, by extending `thing-at-point' to handle
>> parsed elements, not just simplistic regexp-based "thing" boundaries
>> exposed by `thing-at-point' now.
>>
>> Org approaches getting name/begin/end/arguments using a common API:
>>
>> (org-element-property :begin NODE)
>> (org-element-property :end NODE)
>> (org-element-property :contents-begin NODE)
>> (org-element-property :contents-end NODE)
>> (org-element-property :name NODE)
>> (org-element-property :args NODE)
>>
>> Language-agnostic "thing"s will certainly be welcome, especially given
>> that tree-sitter grammars use inconsistent naming schemes, which have to
>> be learned separately, and may even change with grammar versions.
>>
>> I think that both NODE types and attributes can be standardized.
>>
>
> It would be great to see standardization that can work with more than
> just tree-sitter. Depending on how extensive such a generic NODE type
> and accompanying API are, I could see standardization of a lot of things
> that are currently implemented in major modes, to name a few:
>
> - indentation
> - fontification
> - thing-at-point
> - imenu
> - simple forms of completion (variables, function names in buffer)
>
> I have some idea of the underpinnings, but I have never implemented a
> full major mode so it is hard for me to judge the practicality of
> this. How much would be practical to standardize, without needlessly
> complicated/resource-heavy abstractions?

I don’t know which level of standardization you have in mind, but aren’t
these already standardized?

- indentation: indent-line-function / indent-region-function
- fontification: font-lock-defaults
- thing-at-point: the thing-at-point function
- imenu: imenu-create-index-function
- completion: completion-at-point-functions

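To make the list concrete, here is a rough sketch of a toy major mode that plugs into each of these entry points. Everything prefixed with `demo-` is hypothetical and exists only for illustration, including the made-up "def NAME" syntax. The indentation logic is a placeholder, and the custom thing is registered through the `bounds-of-thing-at-point` symbol property, which is only one of the ways thingatpt can be extended.

```elisp
(require 'imenu)
(require 'thingatpt)

;; Hypothetical toy language: a definition is a line of the form "def NAME".
(defvar demo-font-lock-keywords
  '(("\\_<def\\_>" . font-lock-keyword-face))
  "Font-lock keywords for the hypothetical `demo-mode'.")

(defun demo-indent-line ()
  "Indent the current line to column 0 (placeholder logic)."
  (indent-line-to 0))

(defun demo-imenu-index ()
  "Return an imenu alist of (NAME . POSITION) for each \"def NAME\" line."
  (let (index)
    (save-excursion
      (goto-char (point-min))
      (while (re-search-forward "^def \\([[:alnum:]_]+\\)" nil t)
        (push (cons (match-string-no-properties 1) (match-beginning 0))
              index)))
    (nreverse index)))

(defun demo-def-bounds ()
  "Return (START . END) of the \"def NAME\" text on this line, or nil."
  (save-excursion
    (beginning-of-line)
    (when (looking-at "def [[:alnum:]_]+")
      (cons (match-beginning 0) (match-end 0)))))

;; Teach `thing-at-point' about a custom thing named `demo-def'.
(put 'demo-def 'bounds-of-thing-at-point #'demo-def-bounds)

(defun demo-completion-at-point ()
  "Complete the symbol at point against definition names in the buffer."
  (let ((bounds (bounds-of-thing-at-point 'symbol)))
    (when bounds
      (list (car bounds) (cdr bounds)
            (mapcar #'car (demo-imenu-index))))))

(define-derived-mode demo-mode prog-mode "Demo"
  "Toy major mode wiring up the standard entry points."
  (setq-local indent-line-function #'demo-indent-line)
  (setq-local font-lock-defaults '(demo-font-lock-keywords))
  (setq-local imenu-create-index-function #'demo-imenu-index)
  (add-hook 'completion-at-point-functions
            #'demo-completion-at-point nil t))
```

With this in place, `(thing-at-point 'demo-def)` on a definition line returns its text, and M-x imenu lists the definitions. The entry points are shared across modes, but each mode still supplies its own implementation behind them.
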
Yuan