Feature request/RFC: proper highlighting of code embedded in comments
From: Clément Pit--Claudel
Subject: Feature request/RFC: proper highlighting of code embedded in comments
Date: Sat, 15 Oct 2016 11:19:24 -0400
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.3.0
Hi emacs-devel,
Some languages have a way to quote code in comments. Some examples:
* Python
    def example(foo, *bars):
        """Foo some bars.

        >>> example(1,
        ...         2,
        ...         3)
        3
        >>> example(4, 8)
        67
        """
* Coq
    Definition example foo bars :=
      (* [example foo bars] uses [foo] to foo some [bars]. For example:
         <<
         Compute (example 1 [2, 3]).
         (* 3 *)
         >> *)
In Python, ‘>>>’ indicates a doctest (a small bit of example code). In Coq,
‘[…]’ and ‘<<…>>’ mark single-line and multi-line code snippets, respectively,
inside comments. At the moment, Emacs doesn't highlight these snippets as
code. I originally asked about this at
http://emacs.stackexchange.com/questions/19998/code-blocks-in-font-lock-comments
but received no answers.
There are several workarounds available today, but none that I know of is
satisfactory:
* Duplicate all font-lock rules, creating anchored matchers that recognize code
in comments. The duplication is very unpleasant, and it requires adding
‘prepend’ to a bunch of font-lock rules, which will break some of them.
* Use a custom syntax-propertize-function to recognize these code snippets and
escape out of the enclosing comment or string. This has some potential, but it
confuses existing tools. For example, in Python, the code below works fine for
‘>>>’ in comments, but in strings it seems to break eldoc, among others. Here
is the backtrace, followed by the code:
    syntax-ppss()
    python-util-forward-comment(1)
    python-nav-end-of-defun()
    python-info-current-defun()
    (let ((current-defun (python-info-current-defun)))
      (if current-defun (progn (format "In: %s()" current-defun))))

    (defconst litpy--doctest-re
      "^#*\\s-*\\(>>>\\|\\.\\.\\.\\)\\s-*\\(.+\\)$"
      "Regexp matching doctests.")

    (defun litpy--syntax-propertize-function (start end)
      "Mark doctests in START..END."
      (goto-char start)
      (while (re-search-forward litpy--doctest-re end t)
        (let* ((old-syntax (save-excursion (syntax-ppss (match-beginning 1))))
               (in-docstring-p (eq (nth 3 old-syntax) t))
               (in-comment-p (eq (nth 4 old-syntax) t))
               (closing-syntax (cond (in-docstring-p "|") (in-comment-p ">")))
               (reopening-syntax (cond (in-docstring-p "|") (in-comment-p "<")))
               (reopening-char (char-after (match-end 2)))
               (no-reopen (eq (and reopening-char (char-syntax reopening-char))
                              (cond (in-comment-p ?>)))))
          (when closing-syntax
            (put-text-property (1- (match-end 1)) (match-end 1)
                               'syntax-table (string-to-syntax closing-syntax))
            (when (and reopening-char (not no-reopen))
              (put-text-property (match-end 2) (1+ (match-end 2))
                                 'syntax-table (string-to-syntax
                                                reopening-syntax)))))))

Maybe the second approach can be made to more-or-less work for Python, despite
the issue above — I'm not entirely sure. The idea there is to detect chunks of
code, and mark their starting and ending characters in a way that escapes from
the surrounding comment or string.
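
To make the detection step concrete, here is a minimal Python sketch that runs outside Emacs. It is not the litpy code itself: it only ports the regexp (with `[ \t]` standing in for Emacs's whitespace syntax class) and reports which character offsets would receive the "closing" and "reopening" marks. The name `doctest_spans` is hypothetical, and the sketch deliberately ignores the check on whether point is inside a comment or docstring.

```python
import re

# Port of litpy--doctest-re; [ \t] approximates Emacs's \s- here.
DOCTEST_RE = re.compile(r"^#*[ \t]*(>>>|\.\.\.)[ \t]*(.+)$", re.MULTILINE)


def doctest_spans(text):
    """Return (close_index, reopen_index, code) for each doctest line.

    close_index is the last character of the '>>>' or '...' marker,
    which the Emacs code marks as ending the comment or string.
    reopen_index is the character just after the code (usually the
    newline), which the Emacs code marks as reopening it.
    """
    spans = []
    for m in DOCTEST_RE.finditer(text):
        spans.append((m.end(1) - 1, m.end(2), m.group(2)))
    return spans


if __name__ == "__main__":
    sample = "    >>> example(4, 8)\n    67\n"
    print(doctest_spans(sample))  # [(6, 21, 'example(4, 8)')]
```

Everything between the two offsets is what Emacs would then treat as ordinary code, which is also why tools that walk the buffer with syntax-ppss see the docstring end early.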
But this doesn't solve the problem for Coq, for example, because it confuses
comment-forward and the like. Some Coq tools depend on Emacs to identify
comments and skip over them when running a file. Code is sent bit by bit, so if
‘(* foo [some code here] bar *)’ is annotated with syntax properties to make
Emacs think that it should be understood as ‘(* foo *) some code here (* bar
*)’, then Proof General (a Coq IDE based on Emacs) won't realize that “some
code here” is part of a comment, and things will break.
I'm not sure what the right approach is, but I can see two options:
* Mark embedded code in comments as actual code using
syntax-propertize-function, and add a way for tools to detect this "code but
not really code" situation. Pros: things like company, eldoc,
prettify-symbols-mode, etc. will work in embedded code comments without having
to opt them in. Cons: some things will break, and will need to be fixed
(comment-forward, Proof General, Elpy, indentation functions…).
* Add new "code-block-starter"/"code-block-ender" syntax classes? Then
font-lock would know that it has to highlight these. Pros: few things would
break. Cons: tools would have to be opted in (company-mode, eldoc,
prettify-symbols-mode, …).
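
As a rough illustration of what the second option asks of tools, the Python sketch below leaves a Coq comment intact and only reports where the embedded snippets sit, so that a highlighter (or any tool that opts in) could process those ranges. This is not an Emacs API: `SNIPPET_RE` and `embedded_code_spans` are hypothetical names, the delimiters are the ‘[…]’ and ‘<<…>>’ markers from the Coq example above, and nested comments or brackets are not handled.

```python
import re

# Hypothetical delimiters taken from the Coq example in this message:
# << code >> for multi-line snippets, [code] for single-line ones.
# The << >> alternative comes first so brackets inside a block are
# not reported as separate snippets.
SNIPPET_RE = re.compile(r"<<(.*?)>>|\[([^\]\n]*)\]", re.DOTALL)


def embedded_code_spans(comment):
    """Return (start, end, code) for each snippet inside COMMENT.

    The comment text itself is left untouched; callers decide what
    to do with the reported ranges.
    """
    spans = []
    for m in SNIPPET_RE.finditer(comment):
        group = 1 if m.group(1) is not None else 2
        spans.append((m.start(group), m.end(group), m.group(group)))
    return spans


if __name__ == "__main__":
    print(embedded_code_spans("(* foo [some code here] bar *)"))
    # [(8, 22, 'some code here')]
```

Because the comment boundaries never change, comment-skipping tools keep working; the cost is that each tool has to ask for these ranges explicitly.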
Am I missing another obvious solution? Has this topic been discussed before?
Cheers,
Clément.