emacs-devel
Re: Removing no-back-reference restriction from syntax-propertize-rules


From: Tassilo Horn
Subject: Re: Removing no-back-reference restriction from syntax-propertize-rules
Date: Mon, 18 May 2020 23:30:32 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (gnu/linux)

Stefan Monnier <address@hidden> writes:

>> Can you give an example regexp where \N preceded by a non-\ is not a
>> back-reference (and still valid)?
>
> Of course: "bar\\(foo\\)[\\1-9]".

Oh, right.
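
To make the distinction concrete, here is a small sketch using `subregexp-context-p' from subr.el.  The `demo-' variable names and the second regexp are only illustrative assumptions; the first regexp is the example quoted above.

```elisp
;; Stefan's example: the "\1" sits inside a bracket expression, so it is
;; a literal backslash plus the range 1-9, not a back-reference.
(defvar demo-class-re "bar\\(foo\\)[\\1-9]")
;; Hypothetical contrast case: a genuine back-reference to group 1.
(defvar demo-backref-re "\\(foo\\)\\1")

;; Index of the backslash that precedes the digit in each regexp.
(defvar demo-class-pos (string-match "\\\\[0-9]" demo-class-re))
(defvar demo-backref-pos (string-match "\\\\[0-9]" demo-backref-re))

(subregexp-context-p demo-class-re demo-class-pos)     ; nil: inside [...]
(subregexp-context-p demo-backref-re demo-backref-pos) ; t: real back-reference
```

So checking the position of the backslash is enough to tell the two cases apart.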

>> BTW, do I read the docs right in that there are at most nine
>> back-references, i.e., \10 cannot exist?  In that case, we'd have the
>> restriction that at most 9 back-references may appear in all syntax
>> rules.
>
> Apparently, yes:
>
>     ELISP> (string-match "\\(?5:[ab]\\)-\\5" "a-a")
>     0 (#o0, #x0, ?\C-@)
>     ELISP> (string-match "\\(?15:[ab]\\)-\\15" "a-a")
>     nil
>
> [ I guess that's another reason to stay away from backreferences.  ]

Ah, so my "back-refs to explicitly numbered groups don't work at all"
issue was actually that I had used a number greater than 9.
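
As a quick illustration of the single-digit limit (a sketch with made-up test strings, not something from the patch), a two-digit sequence such as \10 is read as the back-reference \1 followed by a literal "0":

```elisp
;; "\\10" is parsed as back-reference \1 followed by a literal "0".
(defvar demo-two-digit-re "\\(a\\)\\10")

(string-match demo-two-digit-re "aa0") ; => 0, i.e. "a" + "a" (via \1) + "0"
(string-match demo-two-digit-re "aa")  ; => nil, the literal "0" is missing
```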

>> I guess in that case we should signal an error, no?
>
> Indeed.

Ok, will do.

>>       (when (save-match-data
>>               ;; With \N, the \ must be in a subregexp context and the
>>               ;; N must not be in a subregexp context.
>>               (and (subregexp-context-p new-re (match-beginning 0))
>>                    (not (subregexp-context-p new-re (match-beginning 1)))))
>
> You don't need/want to test (subregexp-context-p new-re (match-beginning 1)).

Ok.

So all in all, this should give the following patch:

--8<---------------cut here---------------start------------->8---
scratch/syntax-propertize-rules-with-backrefs ba3eee275640d453ffee9f6d9768be1ebd73d51b
Author:     Tassilo Horn <address@hidden>
AuthorDate: Sat May 16 10:05:12 2020 +0200
Commit:     Tassilo Horn <address@hidden>
CommitDate: Mon May 18 23:14:49 2020 +0200

Parent:     ca7224d5db Add test for recent buffer-local-variables change
Merged:     emacs-27 feature/browse-url-browser-kind master scratch/syntax-propertize-rules-with-backrefs
Contained:  scratch/syntax-propertize-rules-with-backrefs
Follows:    emacs-27.0.91 (945)

Allow back-references in syntax-propertize-rules.

* lisp/emacs-lisp/syntax.el (syntax-propertize--shift-groups-and-backrefs):
Renamed from syntax-propertize--shift-groups, and also shift
back-references.
(syntax-propertize-rules): Adapt docstring and use renamed function.

1 file changed, 25 insertions(+), 10 deletions(-)
lisp/emacs-lisp/syntax.el | 35 +++++++++++++++++++++++++----------

modified   lisp/emacs-lisp/syntax.el
@@ -139,14 +139,28 @@ syntax-propertize-multiline
                  (point-max))))
   (cons beg end))
 
-(defun syntax-propertize--shift-groups (re n)
-  (replace-regexp-in-string
-   "\\\\(\\?\\([0-9]+\\):"
-   (lambda (s)
-     (replace-match
-      (number-to-string (+ n (string-to-number (match-string 1 s))))
-      t t s 1))
-   re t t))
+(defun syntax-propertize--shift-groups-and-backrefs (re n)
+  (let ((new-re (replace-regexp-in-string
+                 "\\\\(\\?\\([0-9]+\\):"
+                 (lambda (s)
+                   (replace-match
+                    (number-to-string
+                     (+ n (string-to-number (match-string 1 s))))
+                    t t s 1))
+                 re t t))
+        (pos 0))
+    (while (string-match "\\\\\\([0-9]+\\)" new-re pos)
+      (setq pos (+ 1 (match-beginning 1)))
+      (when (save-match-data
+              ;; With \N, the \ must be in a subregexp context, i.e.,
+              ;; not in a character class or in a \{\} repetition.
+              (subregexp-context-p new-re (match-beginning 0)))
+        (let ((shifted (+ n (string-to-number (match-string 1 new-re)))))
+          (when (> shifted 9)
+            (error "There may be at most nine back-references"))
+          (setq new-re (replace-match (number-to-string shifted)
+                                      t t new-re 1)))))
+    new-re))
 
 (defmacro syntax-propertize-precompile-rules (&rest rules)
   "Return a precompiled form of RULES to pass to `syntax-propertize-rules'.
@@ -190,7 +204,8 @@ syntax-propertize-rules
 Also SYNTAX is free to move point, in which case RULES may not be applied to
 some parts of the text or may be applied several times to other parts.
 
-Note: back-references in REGEXPs do not work."
+Note: There may be at most nine back-references in the REGEXPs of
+all RULES in total."
   (declare (debug (&rest &or symbolp    ;FIXME: edebug this eval step.
                          (form &rest
                                (numberp
@@ -219,7 +234,7 @@ syntax-propertize-rules
                  ;; tell when *this* match 0 has succeeded.
                  (cl-incf offset)
                  (setq re (concat "\\(" re "\\)")))
-               (setq re (syntax-propertize--shift-groups re offset))
+               (setq re (syntax-propertize--shift-groups-and-backrefs re offset))
                (let ((code '())
                      (condition
                       (cond
--8<---------------cut here---------------end--------------->8---

It seems to work fine, and it signals an error as soon as a
back-reference would need to be renumbered to \10 or higher.
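
For anyone who wants to try this without applying the patch, here is a minimal sketch.  The function body is copied from the patch above, but under a hypothetical `demo-' name so that it does not clash with syntax.el; the input regexps and offsets are made up for illustration.

```elisp
;; Copy of the patched function under a hypothetical demo- name so the
;; snippet can be evaluated without applying the patch.
(defun demo-shift-groups-and-backrefs (re n)
  (let ((new-re (replace-regexp-in-string
                 "\\\\(\\?\\([0-9]+\\):"
                 (lambda (s)
                   (replace-match
                    (number-to-string
                     (+ n (string-to-number (match-string 1 s))))
                    t t s 1))
                 re t t))
        (pos 0))
    (while (string-match "\\\\\\([0-9]+\\)" new-re pos)
      (setq pos (+ 1 (match-beginning 1)))
      ;; Only shift \N when the backslash is in a subregexp context.
      (when (save-match-data
              (subregexp-context-p new-re (match-beginning 0)))
        (let ((shifted (+ n (string-to-number (match-string 1 new-re)))))
          (when (> shifted 9)
            (error "There may be at most nine back-references"))
          (setq new-re (replace-match (number-to-string shifted)
                                      t t new-re 1)))))
    new-re))

;; Group number and back-reference are shifted together.
(demo-shift-groups-and-backrefs "\\(?1:[ab]\\)-\\1" 3)
;; => "\\(?4:[ab]\\)-\\4"

;; A "\1" inside a character class is left alone.
(demo-shift-groups-and-backrefs "\\(?1:x\\)[\\1-9]" 1)
;; => "\\(?2:x\\)[\\1-9]"

;; Shifting a back-reference past 9 signals an error.
(condition-case err
    (demo-shift-groups-and-backrefs "\\(?7:a\\)\\7" 3)
  (error (error-message-string err)))
;; => "There may be at most nine back-references"
```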

Good to go?

Bye,
Tassilo


