--- Begin Message ---
Subject: 29.0.60; c-ts-mode: short tokens are not identified as type_identifier
Date: Mon, 02 Jan 2023 10:22:09 +0530
Short tokens are not identified as type_identifier in GNU Emacs
c-ts-mode, but they are in the tree-sitter playground[0].
For example, 'a_type' in an otherwise empty buffer is identified as a
type_identifier in the tree-sitter playground, but not in c-ts-mode,
whereas longer tokens such as 'window_type' are identified as
type_identifier in c-ts-mode as well.
[0] https://tree-sitter.github.io/tree-sitter/playground
In GNU Emacs 29.0.60 (build 5, x86_64-pc-linux-gnu, GTK+ Version
3.24.35, cairo version 1.16.0) of 2023-01-02 built on purism
Repository revision: 2569ede9c496bb060e0b88428cb541088aaba1f9
Repository branch: emacs-29
Windowing system distributor 'The X.Org Foundation', version
11.0.12101004
System Description: Debian GNU/Linux bookworm/sid
--- End Message ---
--- Begin Message ---
Subject: Re: bug#60484: 29.0.60; c-ts-mode: short tokens are not identified as type_identifier
Date: Sat, 7 Jan 2023 16:57:50 -0800
Yuan Fu <casouri@gmail.com> writes:
> Eli Zaretskii <eliz@gnu.org> writes:
>
>>> Date: Mon, 02 Jan 2023 18:13:34 +0530
>>> From: Mohammed Sadiq <sadiq@sadiqpk.org>
>>> Cc: 60484@debbugs.gnu.org
>>>
>>> On 2023-01-02 17:45, Eli Zaretskii wrote:
>>> >> Date: Mon, 02 Jan 2023 10:22:09 +0530
>>> >> From: Mohammed Sadiq <sadiq@sadiqpk.org>
>>> >>
>>> >> Short tokens are not identified as type_identifier in GNU Emacs
>>> >> c-ts-mode, but does work fine with tree-sitter playground[0].
>>> >>
>>> >> Say for example, 'a_type' in an empty buffer is identified as a
>>> >> type_identifier in tree-sitter playground, but not in c-ts-mode,
>>> >> while say, some longer tokens like 'window_type' is identified as
>>> >> type_identifier.
>>> >
>>> > Where is it written that FOO_type is a type identifier? Is this
>>> > something new in some recent C Standard? Or is it just a popular
>>> > convention?
>>>
>>> 'a_type' was just a made-up example; it can be any valid token, say
>>> 'g_file', or whatever. I was pointing out a disparity in the handling
>>> of some tokens between c-ts-mode and tree-sitter: tree-sitter
>>> identifies a 6-byte token as an identifier, but c-ts-mode requires it
>>> to be at least 11 bytes long for custom types.
>>
>> I'm not sure I see a problem here. It sounds like different
>> heuristics to me. Nothing says that g_file is a type, only its
>> parsing can tell.
>
> The parse tree of a buffer with only a_type in it is this:
>
> (translation_unit (ERROR (identifier)))
>
> So tree-sitter-c parses it as a parse error instead of a type. I suppose
> the difference is due to different versions of tree-sitter-c used by
> Emacs (the latest) and the tree-sitter playground website? Maybe the
> playground is using an older version. The "cutoff" point for the
> playground version seems to be 5 bytes: a_typ is considered an error but
> a_type a type.
>
> Yuan
Since it’s a parser problem, I’m closing this.
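
For anyone who wants to check the parse tree quoted above against their
own grammar build, something like the following rough sketch should
work. It assumes Emacs 29 or later built with tree-sitter support and an
installed C grammar; the printed tree depends on the tree-sitter-c
version in use, so it may not match the one quoted above.

```elisp
;; Sketch: print the tree-sitter parse tree for a buffer that holds
;; only the token "a_type".
;; Assumption: Emacs 29+ with tree-sitter support and a C grammar installed.
(require 'treesit)

(if (not (and (fboundp 'treesit-available-p)
              (treesit-available-p)
              (treesit-language-available-p 'c)))
    (message "tree-sitter or the C grammar is not available")
  (with-temp-buffer
    (insert "a_type")
    ;; Create a C parser for the temporary buffer and fetch its root node.
    (let* ((parser (treesit-parser-create 'c))
           (root (treesit-parser-root-node parser)))
      ;; Print the tree as an S-expression, in the same form as quoted above.
      (message "%s" (treesit-node-string root)))))
```

Changing the inserted string to a longer token or to a full declaration
such as "a_type x;" shows how the surrounding context changes the node
types the parser reports.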
Yuan
--- End Message ---