emacs-diffs

emacs-29 197f994384c: Document tree-sitter features in the user manual


From: Eli Zaretskii
Subject: emacs-29 197f994384c: Document tree-sitter features in the user manual
Date: Sun, 29 Jan 2023 08:23:15 -0500 (EST)

branch: emacs-29
commit 197f994384cb37ae4ae7a771815bbe565d4ae242
Author: Eli Zaretskii <eliz@gnu.org>
Commit: Eli Zaretskii <eliz@gnu.org>

    Document tree-sitter features in the user manual
    
    * lisp/progmodes/c-ts-mode.el (c-ts-mode-map): Bind "C-c .", for
    consistency with CC mode.
    * lisp/treesit.el (treesit-font-lock-level): Doc fix.
    
    * doc/emacs/programs.texi (C Indent, Custom C Indent): Document
    the indentation features of 'c-ts-mode'.
    (Moving by Defuns): Document 'treesit-defun-tactic'.
    * doc/emacs/files.texi (Visiting): Document
    'treesit-max-buffer-size'.
    * doc/emacs/display.texi (Traditional Font Lock)
    (Parser-based Font Lock): New subsections.
    * doc/emacs/emacs.texi (Top): Update top-level menu.
---
 doc/emacs/display.texi      | 131 +++++++++++++++++++++++++++++++++++++-------
 doc/emacs/emacs.texi        |   4 ++
 doc/emacs/files.texi        |  11 ++++
 doc/emacs/programs.texi     |  42 ++++++++++----
 lisp/progmodes/c-ts-mode.el |   3 +-
 lisp/treesit.el             |  15 +++--
 6 files changed, 170 insertions(+), 36 deletions(-)

diff --git a/doc/emacs/display.texi b/doc/emacs/display.texi
index f77ab569483..97732b65e32 100644
--- a/doc/emacs/display.texi
+++ b/doc/emacs/display.texi
@@ -1024,17 +1024,65 @@ customize-group @key{RET} font-lock-faces @key{RET}}.  You can then
 use that customization buffer to customize the appearance of these
 faces.  @xref{Face Customization}.
 
+@cindex just-in-time (JIT) font-lock
+@cindex background syntax highlighting
+  Fontifying very large buffers can take a long time.  To avoid large
+delays when a file is visited, Emacs initially fontifies only the
+visible portion of a buffer.  As you scroll through the buffer, each
+portion that becomes visible is fontified as soon as it is displayed;
+this type of Font Lock is called @dfn{Just-In-Time} (or @dfn{JIT})
+Lock.  You can control how JIT Lock behaves, including telling it to
+perform fontification while idle, by customizing variables in the
+customization group @samp{jit-lock}.  @xref{Specific Customization}.
+
+  The information that major modes use for determining which parts of
+buffer text to fontify and what faces to use can be based on several
+different ways of analyzing the text:
+
+@itemize @bullet
+@item
+Search for keywords and other textual patterns based on regular
+expressions (@pxref{Regexp Search,, Regular Expression Search}).
+
+@item
+Find syntactically distinct parts of text based on built-in syntax
+tables (@pxref{Syntax Tables,,, elisp, The Emacs Lisp Reference
+Manual}).
+
+@item
+Use a syntax tree produced by a full-blown parser, via a special-purpose
+library, such as the tree-sitter library (@pxref{Parsing Program
+Source,,, elisp, The Emacs Lisp Reference Manual}), or an external
+program.
+@end itemize
+
+@menu
+* Traditional Font Lock::  Font Lock based on regexps and syntax tables.
+* Parser-based Font Lock:: Font Lock based on external parser.
+@end menu
+
+@node Traditional Font Lock
+@subsection Traditional Font Lock
+@cindex traditional font-lock
+
+  ``Traditional'' methods of providing font-lock information are based
+on regular-expression search and on syntactic analysis using syntax
+tables built into Emacs.  This subsection describes the use and
+customization of font-lock for major modes which use these traditional
+methods.
+
 @vindex font-lock-maximum-decoration
-  You can customize the variable @code{font-lock-maximum-decoration}
-to alter the amount of fontification applied by Font Lock mode, for
-major modes that support this feature.  The value should be a number
-(with 1 representing a minimal amount of fontification; some modes
-support levels as high as 3); or @code{t}, meaning ``as high as
-possible'' (the default).  To be effective for a given file buffer,
-the customization of @code{font-lock-maximum-decoration} should be
-done @emph{before} the file is visited; if you already have the file
-visited in a buffer when you customize this variable, kill the buffer
-and visit the file again after the customization.
+  You can control the amount of fontification applied by Font Lock
+mode by customizing the variable @code{font-lock-maximum-decoration},
+for major modes that support this feature.  The value of this variable
+should be a number (with 1 representing a minimal amount of
+fontification; some modes support levels as high as 3); or @code{t},
+meaning ``as high as possible'' (the default).  To be effective for a
+given file buffer, the customization of
+@code{font-lock-maximum-decoration} should be done @emph{before} the
+file is visited; if you already have the file visited in a buffer when
+you customize this variable, kill the buffer and visit the file again
+after the customization.
 
 You can also specify different numbers for particular major modes; for
 example, to use level 1 for C/C++ modes, and the default level
@@ -1082,16 +1130,59 @@ keywords by customizing the @code{font-lock-ignore} option,
 @pxref{Customizing Keywords,,, elisp, The Emacs Lisp Reference
 Manual}.
 
-@cindex just-in-time (JIT) font-lock
-@cindex background syntax highlighting
-  Fontifying large buffers can take a long time.  To avoid large
-delays when a file is visited, Emacs initially fontifies only the
-visible portion of a buffer.  As you scroll through the buffer, each
-portion that becomes visible is fontified as soon as it is displayed;
-this type of Font Lock is called @dfn{Just-In-Time} (or @dfn{JIT})
-Lock.  You can control how JIT Lock behaves, including telling it to
-perform fontification while idle, by customizing variables in the
-customization group @samp{jit-lock}.  @xref{Specific Customization}.
+@node Parser-based Font Lock
+@subsection Parser-based Font Lock
+@cindex font-lock via tree-sitter
+@cindex parser-based font-lock
+  If your Emacs was built with the tree-sitter library, it can use the
+results of parsing the buffer text by that library for the purposes of
+fontification.  This is usually faster and more accurate than the
+``traditional'' methods described in the previous subsection, since
+the tree-sitter library provides full-blown parsers for programming
+languages and other kinds of formatted text which it supports.  Major
+modes which utilize the tree-sitter library are named
+@code{@var{foo}-ts-mode}, with the @samp{-ts-} part indicating the use
+of the library.  This subsection documents the Font Lock support based
+on the tree-sitter library.
+
+@vindex treesit-font-lock-level
+  You can control the amount of fontification applied by Font Lock
+mode of major modes based on tree-sitter by customizing the variable
+@code{treesit-font-lock-level}.  Its value is a number between 1 and
+4:
+
+@table @asis
+@item Level 1
+This level usually fontifies only comments and function names in
+function definitions.
+@item Level 2
+This level adds fontification of keywords, strings, and data types.
+@item Level 3
+This is the default level; it adds fontification of assignments,
+numbers, properties, etc.
+@item Level 4
+This level adds everything else that can be fontified: operators,
+delimiters, brackets, other punctuation, function names in function
+calls, variables, etc.
+@end table
+
+@vindex treesit-font-lock-feature-list
+@noindent
+What exactly constitutes each of the syntactical categories mentioned
+above depends on the major mode and the parser grammar used by
+tree-sitter for the major-mode's language.  However, in general the
+categories follow the conventions of the programming language or the
+file format supported by the major mode.  The buffer-local value of
+the variable @code{treesit-font-lock-feature-list} holds the
+fontification features supported by a tree-sitter based major mode,
+where each sub-list shows the features provided by the corresponding
+fontification level.
+
+  Once you change the value of @code{treesit-font-lock-level} via
+@w{@kbd{M-x customize-variable}} (@pxref{Specific Customization}), it
+takes effect immediately in all the existing buffers and for files you
+visit in the future in the same session.
+
 
 @node Highlight Interactively
 @section Interactive Highlighting
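
A minimal sketch of the customization described in the Parser-based
Font Lock subsection above, assuming an Emacs 29 build with
tree-sitter support and that the form goes in an init file.  It uses
`customize-set-variable' rather than `setq' because, according to the
docstring changed in lisp/treesit.el, only the Customize path applies
a new level without a separate call to
`treesit-font-lock-recompute-features'.  The value 4 is just an
example.

```elisp
;; Raise the tree-sitter fontification level from the default (3) to
;; the maximum (4).  Going through Customize runs the variable's :set
;; function, which is what makes the change apply to existing
;; tree-sitter buffers as well as to files visited later.
(customize-set-variable 'treesit-font-lock-level 4)
```

The same change can be made interactively with
M-x customize-variable RET treesit-font-lock-level RET.
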
diff --git a/doc/emacs/emacs.texi b/doc/emacs/emacs.texi
index b6d149eb3ef..7071ea44edd 100644
--- a/doc/emacs/emacs.texi
+++ b/doc/emacs/emacs.texi
@@ -383,6 +383,10 @@ Controlling the Display
 * Visual Line Mode::       Word wrap and screen line-based editing.
 * Display Custom::         Information on variables for customizing display.
 
+Font Lock
+* Traditional Font Lock::  Font Lock based on regexps and syntax tables.
+* Parser-based Font Lock:: Font Lock based on external parser.
+
 Searching and Replacement
 
 * Incremental Search::     Search happens as you type the string.
diff --git a/doc/emacs/files.texi b/doc/emacs/files.texi
index 6d666831612..c0e702da947 100644
--- a/doc/emacs/files.texi
+++ b/doc/emacs/files.texi
@@ -215,6 +215,17 @@ by the integers that Emacs can represent (@pxref{Buffers}).  If you
 try, Emacs displays an error message saying that the maximum buffer
 size has been exceeded.
 
+@vindex treesit-max-buffer-size
+  If you try to visit a file whose major mode (@pxref{Major Modes})
+uses the tree-sitter parsing library, Emacs will display a warning if
+the file's size in bytes is larger than the value of the variable
+@code{treesit-max-buffer-size}.  The default value is 40 megabytes for
+64-bit Emacs and 15 megabytes for 32-bit Emacs.  This avoids the
+danger of having Emacs run out of memory by preventing the activation
+of major modes based on tree-sitter in such large buffers, because a
+typical tree-sitter parser needs about 10 times as much memory as the
+text it parses.
+
 @cindex wildcard characters in file names
 @vindex find-file-wildcards
   If the file name you specify contains shell-style wildcard
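
As an illustration of the `treesit-max-buffer-size' paragraph above,
here is a sketch of raising the limit from an init file.  The 100 MiB
figure is an arbitrary example, not a recommendation; the manual text
above estimates that a parser needs about 10 times the memory of the
text it parses, so a larger limit trades safety for convenience.

```elisp
;; Allow tree-sitter based major modes in files of up to 100 MiB.
;; The value is a size in bytes; 100 MiB is only an example figure.
(setq treesit-max-buffer-size (* 100 1024 1024))
```
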
diff --git a/doc/emacs/programs.texi b/doc/emacs/programs.texi
index 4aac150934b..e9268ff2a0d 100644
--- a/doc/emacs/programs.texi
+++ b/doc/emacs/programs.texi
@@ -254,6 +254,17 @@ they do their standard jobs in a way better fitting a particular
 language.  Other major modes may replace any or all of these key
 bindings for that purpose.
 
+@cindex nested defuns
+@vindex treesit-defun-tactic
+  Some programming languages support @dfn{nested defuns}, whereby a
+defun (such as a function or a method or a class) can be defined
+inside (i.e., as part of the body of) another defun.  The commands
+described above by default find the beginning and the end of the
+@emph{innermost} defun around point.  Major modes based on the
+tree-sitter library provide control of this behavior: if the variable
+@code{treesit-defun-tactic} is set to the value @code{top-level}, the
+defun commands will find the @emph{outermost} defuns instead.
+
 @node Imenu
 @subsection Imenu
 @cindex index of buffer definitions
@@ -520,15 +531,19 @@ then indent it like this:
 @item C-c C-q
 @kindex C-c C-q @r{(C mode)}
 @findex c-indent-defun
+@findex c-ts-mode-indent-defun
 Reindent the current top-level function definition or aggregate type
-declaration (@code{c-indent-defun}).
+declaration (@code{c-indent-defun} in CC mode,
+@code{c-ts-mode-indent-defun} in @code{c-ts-mode} based on tree-sitter).
 
 @item C-M-q
 @kindex C-M-q @r{(C mode)}
 @findex c-indent-exp
-Reindent each line in the balanced expression that follows point
-(@code{c-indent-exp}).  A prefix argument inhibits warning messages
-about invalid syntax.
+@findex prog-indent-sexp
+Reindent each line in the balanced expression that follows point.  In
+CC mode, this invokes @code{c-indent-exp}; in tree-sitter based
+@code{c-ts-mode} this invokes a more general @code{prog-indent-sexp}.
+A prefix argument inhibits warning messages about invalid syntax.
 
 @item @key{TAB}
 @findex c-indent-line-or-region
@@ -568,7 +583,8 @@ onto the indentation of the @dfn{anchor statement}.
 
 @table @kbd
 @item C-c . @var{style} @key{RET}
-Select a predefined style @var{style} (@code{c-set-style}).
+Select a predefined style @var{style} (@code{c-set-style} in CC mode,
+@code{c-ts-mode-set-style} in @code{c-ts-mode} based on tree-sitter).
 @end table
 
   A @dfn{style} is a named collection of customizations that can be
@@ -584,6 +600,7 @@ typing @kbd{C-M-q} at the start of a function definition.
 
 @kindex C-c . @r{(C mode)}
 @findex c-set-style
+@findex c-ts-mode-set-style
   To choose a style for the current buffer, use the command @w{@kbd{C-c
 .}}.  Specify a style name as an argument (case is not significant).
 This command affects the current buffer only, and it affects only
@@ -592,11 +609,11 @@ the code already in the buffer.  To reindent the whole buffer in the
 new style, you can type @kbd{C-x h C-M-\}.
 
 @vindex c-default-style
-  You can also set the variable @code{c-default-style} to specify the
-default style for various major modes.  Its value should be either the
-style's name (a string) or an alist, in which each element specifies
-one major mode and which indentation style to use for it.  For
-example,
+  When using CC mode, you can also set the variable
+@code{c-default-style} to specify the default style for various major
+modes.  Its value should be either the style's name (a string) or an
+alist, in which each element specifies one major mode and which
+indentation style to use for it.  For example,
 
 @example
 (setq c-default-style
@@ -613,6 +630,11 @@ one of the C-like major modes; thus, if you specify a new default
 style for Java mode, you can make it take effect in an existing Java
 mode buffer by typing @kbd{M-x java-mode} there.
 
+@vindex c-ts-mode-indent-style
+  When using the tree-sitter based @code{c-ts-mode}, you can set the
+default indentation style by customizing the variable
+@code{c-ts-mode-indent-style}.
+
   The @code{gnu} style specifies the formatting recommended by the GNU
 Project for C; it is the default, so as to encourage use of our
 recommended style.
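
The two tree-sitter settings documented in the programs.texi hunks
above could be used from an init file roughly as follows.  This is a
sketch under assumptions: `python-ts-mode-hook' is chosen only as an
example of a tree-sitter mode for a language with nested defuns, and
`linux' is assumed to be one of the style names that
`c-ts-mode-indent-style' accepts in your Emacs.

```elisp
;; Make the defun-movement commands (e.g. C-M-a and C-M-e) stop at
;; outermost defuns instead of innermost ones in Python tree-sitter
;; buffers.  The hook is only an example; any tree-sitter based mode's
;; hook could be used the same way.
(add-hook 'python-ts-mode-hook
          (lambda ()
            (setq-local treesit-defun-tactic 'top-level)))

;; Set the default indentation style for `c-ts-mode'.
;; `linux' is an example value.
(customize-set-variable 'c-ts-mode-indent-style 'linux)
```

Within a `c-ts-mode' buffer, the style can also be selected
interactively with the "C-c ." binding this commit adds.
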
diff --git a/lisp/progmodes/c-ts-mode.el b/lisp/progmodes/c-ts-mode.el
index b2f92b93193..612c41bf073 100644
--- a/lisp/progmodes/c-ts-mode.el
+++ b/lisp/progmodes/c-ts-mode.el
@@ -700,7 +700,8 @@ the semicolon.  This function skips the semicolon."
 (defvar-keymap c-ts-mode-map
   :doc "Keymap for the C language with tree-sitter"
   :parent prog-mode-map
-  "C-c C-q" #'c-ts-mode-indent-defun)
+  "C-c C-q" #'c-ts-mode-indent-defun
+  "C-c ." #'c-ts-mode-set-style)
 
 ;;;###autoload
 (define-derived-mode c-ts-base-mode prog-mode "C"
diff --git a/lisp/treesit.el b/lisp/treesit.el
index 5fb6a2eef6e..92833fb007c 100644
--- a/lisp/treesit.el
+++ b/lisp/treesit.el
@@ -580,16 +580,21 @@ from 1 which is the absolute minimum, to 4 that yields the maximum
 fontifications.
 
 Level 1 usually contains only comments and definitions.
-Level 2 usually adds keywords, strings, constants, types, etc.
-Level 3 usually represents a full-blown fontification, including
-assignment, constants, numbers, properties, etc.
+Level 2 usually adds keywords, strings, data types, etc.
+Level 3 usually represents full-blown fontifications, including
+assignments, constants, numbers and literals, properties, etc.
 Level 4 adds everything else that can be fontified: delimiters,
-operators, brackets, all functions and variables, etc.
+operators, brackets, punctuation, all functions and variables, etc.
 
 In addition to the decoration level, individual features can be
 turned on/off by calling `treesit-font-lock-recompute-features'.
 Changing the decoration level requires calling
-`treesit-font-lock-recompute-features' to have an effect."
+`treesit-font-lock-recompute-features' to have an effect, unless
+done via `customize-variable'.
+
+To see which syntactical categories are fontified by each level
+in a particular major mode, examine the buffer-local value of the
+variable `treesit-font-lock-feature-list'."
   :type 'integer
   :set #'treesit--font-lock-level-setter
   :version "29.1")
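
The revised docstring can be illustrated with a short sketch, meant
to be evaluated (for example with M-:) in a buffer whose major mode is
based on tree-sitter.  It assumes treesit.el is loaded and that the
mode has set `treesit-font-lock-feature-list'.  The final call to
`font-lock-flush' is an assumption about what is needed to make the
new level visible right away; the docstring itself only requires
`treesit-font-lock-recompute-features'.

```elisp
;; Show which features each fontification level enables in this
;; buffer; each sub-list corresponds to one level.
(message "%S" treesit-font-lock-feature-list)

;; Change the level for the current buffer only, without Customize,
;; then recompute the enabled features as the docstring requires.
(setq-local treesit-font-lock-level 2)
(treesit-font-lock-recompute-features)

;; Assumed extra step: ask Font Lock to refontify the buffer so the
;; new level becomes visible.
(font-lock-flush)
```
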


