bug-gnu-emacs
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

bug#60623: 30.0.50; Add forward-sentence with tree sitter support


From: Theodor Thornhill
Subject: bug#60623: 30.0.50; Add forward-sentence with tree sitter support
Date: Tue, 10 Jan 2023 20:33:52 +0100

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Theodor Thornhill <theo@thornhill.no>
>> Cc: mardani29@yahoo.es, 60623@debbugs.gnu.org, casouri@gmail.com,
>>  monnier@iro.umontreal.ca, juri@linkov.net
>> Date: Tue, 10 Jan 2023 09:37:26 +0100
>> 
>> >> > No, because these are not really sentences in some human-readable
>> >> > language, these are program parts.  As such they should be somewhere
>> >> > under "27 Programs", possibly in "Defuns".
>> >> >
>> >> > However, "Sentences" might mention that programming modes have their
>> >> > own interpretation of "sentence" and corresponding movement commands.
>> >> 
>> >> Yeah, that makes sense.  Should I make an attempt at such formulations,
>> >> or will you do it at a later time?
>> >
>> > It is better that you try, if only to gain experience ;-)
>> 
>> How about this for starter?
>
> Very good, thank you very much.  A few comments below.
>
>> --- a/doc/emacs/programs.texi
>> +++ b/doc/emacs/programs.texi
>> @@ -163,6 +163,7 @@ Defuns
>>  * Left Margin Paren::   An open-paren or similar opening delimiter
>>                            starts a defun if it is at the left margin.
>>  * Moving by Defuns::    Commands to move over or mark a major definition.
>> +* Moving by Sentences:: Commands to move over certain definitions in code.
>                                                          ^^^^^^^^^^^
> I'd use "code units" or "units of code" here.

Done.

>
> Also, should we perhaps name the section "Moving by Statements"? or
> would it be too inaccurate?
>

I'm not sure.  I think that maybe because the commands involved, and the
ones that implicitly will be impacted, such as kill-sentence and friends
it is best to stay with Sentences?  But a statement is the better term
wrt programming languages of course.  I hold no strong opinions here.

>> +  These commands move point or set up the region based on definitions,
>> +also called @dfn{sentences}.  Even though sentences is usually
>
> Each @dfn in a manual should have an index entry, so that readers
> could easily find it.  in this case, the index entry should qualify
> the "sentences" term by the fact that we are talking about units of
> code.  So:
>
>   @cindex sentences, in programming languages
>

Done.

>> +considered when writing human languages, Emacs can use the same
>> +commands to move over certain constructs in programming languages
>> +(@pxref{Sentences}, @pxref{Moving by Defuns}).  In a programming
>> +language a sentence is usually a complete language construct smaller
>> +than defuns, but larger than sexps (@pxref{List Motion,,, elisp, The
>> +Emacs Lisp Reference Manual}).
>
> A couple of examples from two different languages could be a great
> help here.  Otherwise this text sounds a bit too abstract.
>

Something like this?

>> +@kindex M-a
>> +@kindex M-e
>
> Since we already have M-e elsewhere in the manual, I suggest to
> qualify the key bindings here:
>
>   @kindex M-a @r{(programming modes)}
>
> and similarly for M-e.  The @r{..} thingy is necessary to reset to the
> default typeface, since key index is implicitly typeset in @code.
>
>> +@findex backward-sentence
>> +@findex forward-sentence
>
> Likewise with these two @findex entries: qualify them, since we have
> the same commands documented elsewhere under "Sentences".
>

Done.

>> --- a/doc/emacs/text.texi
>> +++ b/doc/emacs/text.texi
>> @@ -253,6 +253,14 @@ Sentences
>>  of a sentence.  Set the variable @code{sentence-end-without-period} to
>>  @code{t} in such cases.
>>  
>> +  Even though the above mentioned sentence movement commands are based
>> +on human languages, other Emacs modes can set these command to get
>> +similar functionality.  What exactly a sentence is in a non-human
>> +language is dependent on the target language, but usually it is
>> +complete statements, such as a variable definition and initialization,
>> +or a conditional statement (@pxref{Moving by Sentences,,, emacs, The
>> +extensible self-documenting text editor}).
>
> The last sentence should be in "Moving by Sentences", since it
> describes the commands documented there.  Also, please add a
> cross-reference here to "Moving by Sentences", since you mention that
> in the text (and rightfully so).
>

Is something like this what you meant?

>> +@defvar treesit-sentence-type-regexp
>> +The value of this variable is a regexp matching the node type of sentence
>> +nodes.  (For ``node'' and ``node type'', @pxref{Parsing Program Source}.)
>> +
>> +@findex treesit-forward-sentence
>> +@findex forward-sentence
>> +@findex backward-sentence
>> +If Emacs is compiled with tree-sitter, it can use the tree-sitter
>> +parser information to move across syntax constructs.  Since what
>> +exactly is considered a sentence varies between languages, a major mode
>> +should set @code{treesit-sentence-type-regexp} to determine that.  Then
>> +the mode can get navigation-by-sentence functionality for free, by using
>> +@code{forward-sentence} and @code{backward-sentence}.
>
> Here please also add a cross-reference to the "Moving by Sentences"
> node in the Emacs manual, so that people could understand what kind of
> "sentences" this is talking about.
>
>> +** New defvar forward-sentence-function.
>       ^^^^^^^^^^
> "New variable"
>
>> +Emacs now can set this variable to customize the behavior of the
>> +'forward-sentence' function.
>
> Not "Emacs", but "major modes".
>
>> +** New defun forward-sentence-default-function.
>       ^^^^^^^^^
> "New function"
>
>> +The previous implementation of 'forward-sentence' is moved into its
>> +own function, to be bound by 'forward-sentence-function'.
>> +
>> +** New defvar-local 'treesit-sentence-type-regexp.
>> +Similarly to 'treesit-defun-type-regexp', this variable is used to
>> +navigate sentences in Tree-sitter enabled modes.
>> +
>> +** New function 'treesit-forward-sentence'.
>> +treesit.el now conditionally sets 'forward-sentence-function' for all
>> +Tree-sitter modes that sets 'treesit-sentence-type-regexp'.
>
> Please make these related items sub-headings of a common heading,
> something like "Commands and variables to move by program statements".
>

Done.

>> +
>>  
>>  * Changes in Specialized Modes and Packages in Emacs 30.1
>>  ---
>> diff --git a/lisp/textmodes/paragraphs.el b/lisp/textmodes/paragraphs.el
>> index 73abb155aaa..fd2d83eeebf 100644
>> --- a/lisp/textmodes/paragraphs.el
>> +++ b/lisp/textmodes/paragraphs.el
>> @@ -441,13 +441,12 @@ end-of-paragraph-text
>>        (if (< (point) (point-max))
>>            (end-of-paragraph-text))))))
>>  
>> -(defun forward-sentence (&optional arg)
>> +(defun forward-sentence-default-function (&optional arg)
>>    "Move forward to next end of sentence.  With argument, repeat.
>>  When ARG is negative, move backward repeatedly to start of sentence.
>>  
>>  The variable `sentence-end' is a regular expression that matches ends of
>>  sentences.  Also, every paragraph boundary terminates sentences as well."
>> -  (interactive "^p")
>>    (or arg (setq arg 1))
>>    (let ((opoint (point))
>>          (sentence-end (sentence-end)))
>> @@ -480,6 +479,18 @@ forward-sentence
>>      (let ((npoint (constrain-to-field nil opoint t)))
>>        (not (= npoint opoint)))))
>>  
>> +(defvar forward-sentence-function #'forward-sentence-default-function
>> +  "Function to be used to calculate sentence movements.
>> +See `forward-sentence' for a description of its behavior.")
>> +
>> +(defun forward-sentence (&optional arg)
>> +  "Move forward to next end of sentence.  With argument, repeat.
>                                              ^^^^^^^^^^^^^^^^^^^^^
> "With argument ARG, repeat."  The doc string should reference the
> arguments where possible.
>

Thanks, done.

>> +When ARG is negative, move backward repeatedly to start of sentence.
>    ^^^^
> "If", not "When".
>

Done

>> +(defvar-local treesit-sentence-type-regexp ""
>> +  "A regexp that matches the node type of sentence nodes.
>
> Why is the default an empty regexp? wouldn't nil be better?

Indeed it will, done.

How about this?

Theo

Attachment: 0001-Add-forward-sentence-with-tree-sitter-support-bug-60.patch
Description: Text Data


reply via email to

[Prev in Thread] Current Thread [Next in Thread]