emacs-diffs
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Emacs-diffs] master 07367e5 2/2: Add rx extension mechanism


From: Paul Eggert
Subject: [Emacs-diffs] master 07367e5 2/2: Add rx extension mechanism
Date: Wed, 25 Sep 2019 17:29:56 -0400 (EDT)

branch: master
commit 07367e5b95fe31f3d4e994b42b081075501b9b60
Author: Mattias Engdegård <address@hidden>
Commit: Paul Eggert <address@hidden>

    Add rx extension mechanism
    
    Add a built-in set of extension macros: `rx-define', `rx-let' and
    `rx-let-eval'.
    
    * lisp/emacs-lisp/rx.el (rx-constituents, rx-to-string): Doc updates.
    (rx--builtin-symbols, rx--builtin-names, rx--local-definitions)
    (rx--lookup-def, rx--substitute, rx--expand-template)
    (rx--make-binding, rx--make-named-binding, rx--extend-local-defs)
    (rx-let-eval, rx-let, rx-define): New.
    (rx--translate-symbol, rx--translate-form): Use extensions if any.
    (rx): Use local definitions.
    * test/lisp/emacs-lisp/rx-tests.el (rx-let, rx-define)
    (rx-to-string-define, rx-let-define, rx-let-eval): New.
    * etc/NEWS (Changes in Specialized Modes and Packages):
    * doc/lispref/searching.texi (Rx Notation, Rx Functions, Extending Rx):
    Add node about rx extensions.
---
 doc/lispref/searching.texi       | 157 ++++++++++++++++++++
 etc/NEWS                         |   4 +
 lisp/emacs-lisp/rx.el            | 299 ++++++++++++++++++++++++++++++++++++---
 test/lisp/emacs-lisp/rx-tests.el |  98 +++++++++++++
 4 files changed, 538 insertions(+), 20 deletions(-)

diff --git a/doc/lispref/searching.texi b/doc/lispref/searching.texi
index 2d94e56..a4b6533 100644
--- a/doc/lispref/searching.texi
+++ b/doc/lispref/searching.texi
@@ -1037,6 +1037,7 @@ customisation.
 @menu
 * Rx Constructs::       Constructs valid in rx forms.
 * Rx Functions::        Functions and macros that use rx forms.
+* Extending Rx::        How to define your own rx forms.
 @end menu
 
 @node Rx Constructs
@@ -1524,6 +1525,162 @@ must be string literals.
 
 The @code{pcase} macro can use @code{rx} expressions as patterns
 directly; @pxref{rx in pcase}.
+
+For mechanisms to add user-defined extensions to the @code{rx}
+notation, @pxref{Extending Rx}.
+
+@node Extending Rx
+@subsubsection Defining new @code{rx} forms
+
+The @code{rx} notation can be extended by defining new symbols and
+parametrised forms in terms of other @code{rx} expressions.  This is
+handy for sharing parts between several regexps, and for making
+complex ones easier to build and understand by putting them together
+from smaller pieces.
+
+For example, you could define @code{name} to mean
+@code{(one-or-more letter)}, and @code{(quoted @var{x})} to mean
+@code{(seq ?' @var{x} ?')} for any @var{x}.  These forms could then be
+used in @code{rx} expressions like any other: @code{(rx (quoted name))}
+would match a nonempty sequence of letters inside single quotes.
+
+The Lisp macros below provide different ways of binding names to
+definitions.  Common to all of them are the following rules:
+
+@itemize
+@item
+Built-in @code{rx} forms, like @code{digit} and @code{group}, cannot
+be redefined.
+
+@item
+The definitions live in a name space of their own, separate from that
+of Lisp variables.  There is thus no need to attach a suffix like
+@code{-regexp} to names; they cannot collide with anything else.
+
+@item
+Definitions cannot refer to themselves recursively, directly or
+indirectly.  If you find yourself needing this, you want a parser, not
+a regular expression.
+
+@item
+Definitions are only ever expanded in calls to @code{rx} or
+@code{rx-to-string}, not merely by their presence in definition
+macros.  This means that the order of definitions doesn't matter, even
+when they refer to each other, and that syntax errors only show up
+when they are used, not when they are defined.
+
+@item
+User-defined forms are allowed wherever arbitrary @code{rx}
+expressions are expected; for example, in the body of a
+@code{zero-or-one} form, but not inside @code{any} or @code{category}
+forms.
+@end itemize
+
+@defmac rx-define name [arglist] rx-form
+Define @var{name} globally in all subsequent calls to @code{rx} and
+@code{rx-to-string}.  If @var{arglist} is absent, then @var{name} is
+defined as a plain symbol to be replaced with @var{rx-form}.  Example:
+
+@example
+@group
+(rx-define haskell-comment (seq "--" (zero-or-more nonl)))
+(rx haskell-comment)
+     @result{} "--.*"
+@end group
+@end example
+
+If @var{arglist} is present, it must be a list of zero or more
+argument names, and @var{name} is then defined as a parametrised form.
+When used in an @code{rx} expression as @code{(@var{name} @var{arg}@dots{})},
+each @var{arg} will replace the corresponding argument name inside
+@var{rx-form}.
+
+@var{arglist} may end in @code{&rest} and one final argument name,
+denoting a rest parameter.  The rest parameter will expand to all
+extra actual argument values not matched by any other parameter in
+@var{arglist}, spliced into @var{rx-form} where it occurs.  Example:
+
+@example
+@group
+(rx-define moan (x y &rest r) (seq x (one-or-more y) r "!"))
+(rx (moan "MOO" "A" "MEE" "OW"))
+     @result{} "MOOA+MEEOW!"
+@end group
+@end example
+
+Since the definition is global, it is recommended to give @var{name} a
+package prefix to avoid name clashes with definitions elsewhere, as is
+usual when naming non-local variables and functions.
+@end defmac
+
+@defmac rx-let (bindings@dots{}) body@dots{}
+Make the @code{rx} definitions in @var{bindings} available locally for
+@code{rx} macro invocations in @var{body}, which is then evaluated.
+
+Each element of @var{bindings} is on the form
+@w{@code{(@var{name} [@var{arglist}] @var{rx-form})}}, where the parts
+have the same meaning as in @code{rx-define} above.  Example:
+
+@example
+@group
+(rx-let ((comma-separated (item) (seq item (0+ "," item)))
+         (number (1+ digit))
+         (numbers (comma-separated number)))
+  (re-search-forward (rx "(" numbers ")")))
+@end group
+@end example
+
+The definitions are only available during the macro-expansion of
+@var{body}, and are thus not present during execution of compiled
+code.
+
+@code{rx-let} can be used not only inside a function, but also at top
+level to include global variable and function definitions that need
+to share a common set of @code{rx} forms.  Since the names are local
+inside @var{body}, there is no need for any package prefixes.
+Example:
+
+@example
+@group
+(rx-let ((phone-number (seq (opt ?+) (1+ (any digit ?-)))))
+  (defun find-next-phone-number ()
+    (re-search-forward (rx phone-number)))
+  (defun phone-number-p (string)
+    (string-match-p (rx bos phone-number eos) string)))
+@end group
+@end example
+
+The scope of the @code{rx-let} bindings is lexical, which means that
+they are not visible outside @var{body} itself, even in functions
+called from @var{body}.
+@end defmac
+
+@defmac rx-let-eval bindings body@dots{}
+Evaluate @var{bindings} to a list of bindings as in @code{rx-let},
+and evaluate @var{body} with those bindings in effect for calls
+to @code{rx-to-string}.
+
+This macro is similar to @code{rx-let}, except that the @var{bindings}
+argument is evaluated (and thus needs to be quoted if it is a list
+literal), and the definitions are substituted at run time, which is
+required for @code{rx-to-string} to work.  Example:
+
+@example
+@group
+(rx-let-eval
+    '((ponder (x) (seq "Where have all the " x " gone?")))
+  (looking-at (rx-to-string
+               '(ponder (or "flowers" "young girls"
+                            "left socks")))))
+@end group
+@end example
+
+Another difference from @code{rx-let} is that the @var{bindings} are
+dynamically scoped, and thus also available in functions called from
+@var{body}. However, they are not visible inside functions defined in
+@var{body}.
+@end defmac
+
 @end ifnottex
 
 @node Regexp Functions
diff --git a/etc/NEWS b/etc/NEWS
index 96b2cb1..9a0b633 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -1725,6 +1725,10 @@ This also works for their aliases: '|' for 'or'; ':', 
'and' and
 In this case, 'rx' will generate code which produces a regexp string
 at run time, instead of a constant string.
 
+---
+*** New rx extension mechanism: 'rx-define', 'rx-let', 'rx-let-eval'.
+These macros add new forms to the rx notation.
+
 ** Frames
 
 +++
diff --git a/lisp/emacs-lisp/rx.el b/lisp/emacs-lisp/rx.el
index 9b3419e..a192ed1 100644
--- a/lisp/emacs-lisp/rx.el
+++ b/lisp/emacs-lisp/rx.el
@@ -97,6 +97,7 @@ Most of the names are from SRE.")
 
 (defvar rx-constituents nil
   "Alist of old-style rx extensions, for compatibility.
+For new code, use `rx-define', `rx-let' or `rx-let-eval'.
 
 Each element is (SYMBOL . DEF).
 
@@ -113,6 +114,17 @@ If DEF is a list on the form (FUN MIN-ARGS MAX-ARGS PRED), 
then
    If PRED is non-nil, it is a predicate that all actual arguments must
    satisfy.")
 
+(defvar rx--local-definitions nil
+  "Alist of dynamic local rx definitions.
+Each entry is:
+ (NAME DEF)      -- NAME is an rx symbol defined as the rx form DEF.
+ (NAME ARGS DEF) -- NAME is an rx form with arglist ARGS, defined
+                    as the rx form DEF (which can contain members of ARGS).")
+
+(defsubst rx--lookup-def (name)
+  (or (cdr (assq name rx--local-definitions))
+      (get name 'rx-definition)))
+
 ;; TODO: Additions to consider:
 ;; - A better name for `anything', like `any-char' or `anychar'.
 ;; - A name for (or), maybe `unmatchable'.
@@ -144,6 +156,12 @@ If DEF is a list on the form (FUN MIN-ARGS MAX-ARGS PRED), 
then
       ((let ((class (cdr (assq sym rx--char-classes))))
          (and class (cons (list (concat "[[:" (symbol-name class) ":]]")) t))))
 
+      ((let ((definition (rx--lookup-def sym)))
+         (and definition
+              (if (cdr definition)
+                  (error "Not an `rx' symbol definition: %s" sym)
+                (rx--translate (nth 0 definition))))))
+
       ;; For compatibility with old rx.
       ((let ((entry (assq sym rx-constituents)))
          (and (progn
@@ -310,6 +328,19 @@ INTERVALS is a list of (START . END) with START ≤ END, 
sorted by START."
         (setq tail d)))
     intervals))
 
+;; FIXME: Consider expanding definitions inside (any ...) and (not ...),
+;; and perhaps allow (any ...) inside (any ...).
+;; It would be benefit composability (build a character alternative by pieces)
+;; and be handy for obtaining the complement of a defined set of
+;; characters.  (See, for example, python.el:421, `not-simple-operator'.)
+;; (Expansion in other non-rx positions is probably not a good idea:
+;; syntax, category, backref, and the integer parameters of group-n,
+;; =, >=, **, repeat)
+;; Similar effect could be attained by ensuring that
+;; (or (any X) (any Y)) -> (any X Y), and find a way to compose negative
+;; sets.  `and' is taken, but we could add
+;; (intersection (not (any X)) (not (any Y))) -> (not (any X Y)).
+
 (defun rx--translate-any (negated body)
   "Translate an (any ...) construct.  Return (REGEXP . PRECEDENCE).
 If NEGATED, negate the sense."
@@ -712,6 +743,94 @@ DEF is the definition tuple.  Return (REGEXP . 
PRECEDENCE)."
         (error "The `%s' form did not expand to a string" (car form)))
       (cons (list regexp) nil))))
 
+(defun rx--substitute (bindings form)
+  "Substitute BINDINGS in FORM.  BINDINGS is an alist of (NAME . VALUES)
+where VALUES is a list to splice into FORM wherever NAME occurs.
+Return the substitution result wrapped in a list, since a single value
+can expand to any number of values."
+  (cond ((symbolp form)
+         (let ((binding (assq form bindings)))
+           (if binding
+               (cdr binding)
+             (list form))))
+        ((consp form)
+         (if (listp (cdr form))
+             ;; Proper list.  We substitute variables even in the head
+             ;; position -- who knows, might be handy one day.
+             (list (mapcan (lambda (x) (copy-sequence
+                                        (rx--substitute bindings x)))
+                           form))
+           ;; Cons pair (presumably an interval).
+           (let ((first (rx--substitute bindings (car form)))
+                 (second (rx--substitute bindings (cdr form))))
+             (if (and first (not (cdr first))
+                      second (not (cdr second)))
+                 (list (cons (car first) (car second)))
+               (error
+                "Cannot substitute a &rest parameter into a dotted pair")))))
+        (t (list form))))
+
+;; FIXME: Consider adding extensions in Lisp macro style, where
+;; arguments are passed unevaluated to code that returns the rx form
+;; to use.  Example:
+;;
+;;   (rx-let ((radix-digit (radix)
+;;             :lisp (list 'any (cons ?0 (+ ?0 (eval radix) -1)))))
+;;     (rx (radix-digit (+ 5 3))))
+;; =>
+;;   "[0-7]"
+;;
+;; While this would permit more powerful extensions, it's unclear just
+;; how often they would be used in practice.  Let's wait until there is
+;; demand for it.
+
+;; FIXME: An alternative binding syntax would be
+;;
+;;   (NAME RXs...)
+;; and
+;;   ((NAME ARGS...) RXs...)
+;;
+;; which would have two minor advantages: multiple RXs with implicit
+;; `seq' in the definition, and the arglist is no longer an optional
+;; element in the middle of the list.  On the other hand, it's less
+;; like traditional lisp arglist constructs (defun, defmacro).
+;; Since it's a Scheme-like syntax, &rest parameters could be done using
+;; dotted lists:
+;;  (rx-let (((name arg1 arg2 . rest) ...definition...)) ...)
+
+(defun rx--expand-template (op values arglist template)
+  "Return TEMPLATE with variables in ARGLIST replaced with VALUES."
+  (let ((bindings nil)
+        (value-tail values)
+        (formals arglist))
+    (while formals
+      (pcase (car formals)
+        ('&rest
+         (unless (cdr formals)
+           (error
+            "Expanding rx def `%s': missing &rest parameter name" op))
+         (push (cons (cadr formals) value-tail) bindings)
+         (setq formals nil)
+         (setq value-tail nil))
+        (name
+         (unless value-tail
+           (error
+            "Expanding rx def `%s': too few arguments (got %d, need %s%d)"
+            op (length values)
+            (if (memq '&rest arglist) "at least " "")
+            (- (length arglist) (length (memq '&rest arglist)))))
+         (push (cons name (list (car value-tail))) bindings)
+         (setq value-tail (cdr value-tail))))
+      (setq formals (cdr formals)))
+    (when value-tail
+      (error
+       "Expanding rx def `%s': too many arguments (got %d, need %d)"
+       op (length values) (length arglist)))
+    (let ((subst (rx--substitute bindings template)))
+      (if (and subst (not (cdr subst)))
+          (car subst)
+        (error "Expanding rx def `%s': must result in a single value" op)))))
+
 (defun rx--translate-form (form)
   "Translate an rx form (list structure).  Return (REGEXP . PRECEDENCE)."
   (let ((body (cdr form)))
@@ -757,24 +876,29 @@ DEF is the definition tuple.  Return (REGEXP . 
PRECEDENCE)."
       (op
        (unless (symbolp op)
          (error "Bad rx operator `%S'" op))
+       (let ((definition (rx--lookup-def op)))
+         (if definition
+             (if (cdr definition)
+                 (rx--translate
+                  (rx--expand-template
+                   op body (nth 0 definition) (nth 1 definition)))
+               (error "Not an `rx' form definition: %s" op))
+
+           ;; For compatibility with old rx.
+           (let ((entry (assq op rx-constituents)))
+             (if (progn
+                   (while (and entry (not (consp (cdr entry))))
+                     (setq entry
+                           (if (symbolp (cdr entry))
+                               ;; Alias for another entry.
+                               (assq (cdr entry) rx-constituents)
+                             ;; Wrong type, try further down the list.
+                             (assq (car entry)
+                                   (cdr (memq entry rx-constituents))))))
+                   entry)
+                 (rx--translate-compat-form (cdr entry) form)
+               (error "Unknown rx form `%s'" op)))))))))
 
-       ;; For compatibility with old rx.
-       (let ((entry (assq op rx-constituents)))
-         (if (progn
-               (while (and entry (not (consp (cdr entry))))
-                 (setq entry
-                       (if (symbolp (cdr entry))
-                           ;; Alias for another entry.
-                           (assq (cdr entry) rx-constituents)
-                         ;; Wrong type, try further down the list.
-                         (assq (car entry)
-                               (cdr (memq entry rx-constituents))))))
-               entry)
-             (rx--translate-compat-form (cdr entry) form)
-           (error "Unknown rx form `%s'" op)))))))
-
-;; Defined here rather than in re-builder to lower the odds that it
-;; will be kept in sync with changes.
 (defconst rx--builtin-forms
   '(seq sequence : and or | any in char not-char not
     repeat = >= **
@@ -786,7 +910,21 @@ DEF is the definition tuple.  Return (REGEXP . 
PRECEDENCE)."
     group submatch group-n submatch-n backref
     syntax not-syntax category
     literal eval regexp regex)
-  "List of built-in rx forms.  For use in re-builder only.")
+  "List of built-in rx function-like symbols.")
+
+(defconst rx--builtin-symbols
+  (append '(nonl not-newline any anything
+            bol eol line-start line-end
+            bos eos string-start string-end
+            bow eow word-start word-end
+            symbol-start symbol-end
+            point word-boundary not-word-boundary not-wordchar)
+          (mapcar #'car rx--char-classes))
+  "List of built-in rx variable-like symbols.")
+
+(defconst rx--builtin-names
+  (append rx--builtin-forms rx--builtin-symbols)
+  "List of built-in rx names.  These cannot be redefined by the user.")
 
 (defun rx--translate (item)
   "Translate the rx-expression ITEM.  Return (REGEXP . PRECEDENCE)."
@@ -810,7 +948,9 @@ DEF is the definition tuple.  Return (REGEXP . PRECEDENCE)."
 The arguments to `literal' and `regexp' forms inside FORM must be
 constant strings.
 If NO-GROUP is non-nil, don't bracket the result in a non-capturing
-group."
+group.
+
+For extending the `rx' notation in FORM, use `rx-define' or `rx-let-eval'."
   (let* ((item (rx--translate form))
          (exprs (if no-group
                     (car item)
@@ -939,14 +1079,133 @@ Zero-width assertions: these all match the empty string 
in specific places.
 (regexp EXPR)  Match the string regexp from evaluating EXPR at run time.
 (eval EXPR)    Match the rx sexp from evaluating EXPR at compile time.
 
+Additional constructs can be defined using `rx-define' and `rx-let',
+which see.
+
 \(fn REGEXPS...)"
-  (rx--to-expr (cons 'seq regexps)))
+  ;; Retrieve local definitions from the macroexpansion environment.
+  ;; (It's unclear whether the previous value of `rx--local-definitions'
+  ;; should be included, and if so, in which order.)
+  (let ((rx--local-definitions
+         (cdr (assq :rx-locals macroexpand-all-environment))))
+    (rx--to-expr (cons 'seq regexps))))
+
+(defun rx--make-binding (name tail)
+  "Make a definitions entry out of TAIL.
+TAIL is on the form ([ARGLIST] DEFINITION)."
+  (unless (symbolp name)
+    (error "Bad `rx' definition name: %S" name))
+  ;; FIXME: Consider using a hash table or symbol property, for speed.
+  (when (memq name rx--builtin-names)
+    (error "Cannot redefine built-in rx name `%s'" name))
+  (pcase tail
+    (`(,def)
+     (list def))
+    (`(,args ,def)
+     (unless (and (listp args) (rx--every #'symbolp args))
+       (error "Bad argument list for `rx' definition %s: %S" name args))
+     (list args def))
+    (_ (error "Bad `rx' definition of %s: %S" name tail))))
+
+(defun rx--make-named-binding (bindspec)
+  "Make a definitions entry out of BINDSPEC.
+BINDSPEC is on the form (NAME [ARGLIST] DEFINITION)."
+  (unless (consp bindspec)
+    (error "Bad `rx-let' binding: %S" bindspec))
+  (cons (car bindspec)
+        (rx--make-binding (car bindspec) (cdr bindspec))))
+
+(defun rx--extend-local-defs (bindspecs)
+  (append (mapcar #'rx--make-named-binding bindspecs)
+          rx--local-definitions))
 
+;;;###autoload
+(defmacro rx-let-eval (bindings &rest body)
+  "Evaluate BODY with local BINDINGS for `rx-to-string'.
+BINDINGS, after evaluation, is a list of definitions each on the form
+(NAME [(ARGS...)] RX), in effect for calls to `rx-to-string'
+in BODY.
+
+For bindings without an ARGS list, NAME is defined as an alias
+for the `rx' expression RX.  Where ARGS is supplied, NAME is
+defined as an `rx' form with ARGS as argument list.  The
+parameters are bound from the values in the (NAME ...) form and
+are substituted in RX.  ARGS can contain `&rest' parameters,
+whose values are spliced into RX where the parameter name occurs.
+
+Any previous definitions with the same names are shadowed during
+the expansion of BODY only.
+For extensions when using the `rx' macro, use `rx-let'.
+To make global rx extensions, use `rx-define'.
+For more details, see Info node `(elisp) Extending Rx'.
+
+\(fn BINDINGS BODY...)"
+  (declare (indent 1) (debug (form body)))
+  ;; FIXME: this way, `rx--extend-local-defs' may need to be autoloaded.
+  `(let ((rx--local-definitions (rx--extend-local-defs ,bindings)))
+     ,@body))
+
+;;;###autoload
+(defmacro rx-let (bindings &rest body)
+  "Evaluate BODY with local BINDINGS for `rx'.
+BINDINGS is an unevaluated list of bindings each on the form
+(NAME [(ARGS...)] RX).
+They are bound lexically and are available in `rx' expressions in
+BODY only.
+
+For bindings without an ARGS list, NAME is defined as an alias
+for the `rx' expression RX.  Where ARGS is supplied, NAME is
+defined as an `rx' form with ARGS as argument list.  The
+parameters are bound from the values in the (NAME ...) form and
+are substituted in RX.  ARGS can contain `&rest' parameters,
+whose values are spliced into RX where the parameter name occurs.
+
+Any previous definitions with the same names are shadowed during
+the expansion of BODY only.
+For local extensions to `rx-to-string', use `rx-let-eval'.
+To make global rx extensions, use `rx-define'.
+For more details, see Info node `(elisp) Extending Rx'.
+
+\(fn BINDINGS BODY...)"
+  (declare (indent 1) (debug (sexp body)))
+  (let ((prev-locals (cdr (assq :rx-locals macroexpand-all-environment)))
+        (new-locals (mapcar #'rx--make-named-binding bindings)))
+    (macroexpand-all (cons 'progn body)
+                     (cons (cons :rx-locals (append new-locals prev-locals))
+                           macroexpand-all-environment))))
+
+;;;###autoload
+(defmacro rx-define (name &rest definition)
+  "Define NAME as a global `rx' definition.
+If the ARGS list is omitted, define NAME as an alias for the `rx'
+expression RX.
+
+If the ARGS list is supplied, define NAME as an `rx' form with
+ARGS as argument list.  The parameters are bound from the values
+in the (NAME ...) form and are substituted in RX.
+ARGS can contain `&rest' parameters, whose values are spliced
+into RX where the parameter name occurs.
+
+Any previous global definition of NAME is overwritten with the new one.
+To make local rx extensions, use `rx-let' for `rx',
+`rx-let-eval' for `rx-to-string'.
+For more details, see Info node `(elisp) Extending Rx'.
+
+\(fn NAME [(ARGS...)] RX)"
+  (declare (indent 1))
+  `(eval-and-compile
+     (put ',name 'rx-definition ',(rx--make-binding name definition))
+     ',name))
 
 ;; During `rx--pcase-transform', list of defined variables in right-to-left
 ;; order.
 (defvar rx--pcase-vars)
 
+;; FIXME: The rewriting strategy for pcase works so-so with extensions;
+;; definitions cannot expand to `let' or named `backref'.  If this ever
+;; becomes a problem, we can handle those forms in the ordinary parser,
+;; using a dynamic variable for activating the augmented forms.
+
 (defun rx--pcase-transform (rx)
   "Transform RX, an rx-expression augmented with `let' and named `backref',
 into a plain rx-expression, collecting names into `rx--pcase-vars'."
diff --git a/test/lisp/emacs-lisp/rx-tests.el b/test/lisp/emacs-lisp/rx-tests.el
index fec046d..11de477 100644
--- a/test/lisp/emacs-lisp/rx-tests.el
+++ b/test/lisp/emacs-lisp/rx-tests.el
@@ -303,6 +303,104 @@
   (should (equal (rx-to-string '(or nonl "\nx") t)
                  ".\\|\nx")))
 
+(ert-deftest rx-let ()
+  (rx-let ((beta gamma)
+           (gamma delta)
+           (delta (+ digit))
+           (epsilon (or gamma nonl)))
+    (should (equal (rx bol delta epsilon)
+                   "^[[:digit:]]+\\(?:[[:digit:]]+\\|.\\)")))
+  (rx-let ((p () point)
+           (separated (x sep) (seq x (* sep x)))
+           (comma-separated (x) (separated x ","))
+           (semi-separated (x) (separated x ";"))
+           (matrix (v) (semi-separated (comma-separated v))))
+    (should (equal (rx (p) (matrix (+ "a")) eos)
+                   "\\=a+\\(?:,a+\\)*\\(?:;a+\\(?:,a+\\)*\\)*\\'")))
+  (rx-let ((b bol)
+           (z "B")
+           (three (x) (= 3 x)))
+    (rx-let ((two (x) (seq x x))
+             (z "A")
+             (e eol))
+      (should (equal (rx b (two (three z)) e)
+                     "^A\\{3\\}A\\{3\\}$"))))
+  (rx-let ((f (a b &rest r) (seq "<" a ";" b ":" r ">")))
+    (should (equal (rx bol (f ?x ?y) ?! (f ?u ?v ?w) ?! (f ?k ?l ?m ?n) eol)
+                   "^<x;y:>!<u;v:w>!<k;l:mn>$")))
+
+  ;; Rest parameters are expanded by splicing.
+  (rx-let ((f (&rest r) (or bol r eol)))
+    (should (equal (rx (f "ab" nonl))
+                   "^\\|ab\\|.\\|$")))
+
+  ;; Substitution is done in number positions.
+  (rx-let ((stars (n) (= n ?*)))
+    (should (equal (rx (stars 4))
+                   "\\*\\{4\\}")))
+
+  ;; Substitution is done inside dotted pairs.
+  (rx-let ((f (x y z) (any x (y . z))))
+    (should (equal (rx (f ?* ?a ?t))
+                   "[*a-t]")))
+
+  ;; Substitution is done in the head position of forms.
+  (rx-let ((f (x) (x "a")))
+    (should (equal (rx (f +))
+                   "a+"))))
+
+(ert-deftest rx-define ()
+  (rx-define rx--a (seq "x" (opt "y")))
+  (should (equal (rx bol rx--a eol)
+                 "^xy?$"))
+  (rx-define rx--c (lb rb &rest stuff) (seq lb stuff rb))
+  (should (equal (rx bol (rx--c "<" ">" rx--a nonl) eol)
+                 "^<xy?.>$"))
+  (rx-define rx--b (* rx--a))
+  (should (equal (rx rx--b)
+                 "\\(?:xy?\\)*"))
+  (rx-define rx--a "z")
+  (should (equal (rx rx--b)
+                 "z*")))
+
+(defun rx--test-rx-to-string-define ()
+  ;; `rx-define' won't expand to code inside `ert-deftest' since we use
+  ;; `eval-and-compile'.  Put it into a defun as a workaround.
+  (rx-define rx--d "Q")
+  (rx-to-string '(seq bol rx--d) t))
+
+(ert-deftest rx-to-string-define ()
+  "Check that `rx-to-string' uses definitions made by `rx-define'."
+  (should (equal (rx--test-rx-to-string-define)
+                 "^Q")))
+
+(ert-deftest rx-let-define ()
+  "Test interaction between `rx-let' and `rx-define'."
+  (rx-define rx--e "one")
+  (rx-define rx--f "eins")
+  (rx-let ((rx--e "two"))
+    (should (equal (rx rx--e nonl rx--f) "two.eins"))
+    (rx-define rx--e "three")
+    (should (equal (rx rx--e) "two"))
+    (rx-define rx--f "zwei")
+    (should (equal (rx rx--f) "zwei")))
+  (should (equal (rx rx--e nonl rx--f) "three.zwei")))
+
+(ert-deftest rx-let-eval ()
+  (rx-let-eval '((a (* digit))
+                 (f (x &rest r) (seq x nonl r)))
+    (should (equal (rx-to-string '(seq a (f bow a ?b)) t)
+                   "[[:digit:]]*\\<.[[:digit:]]*b"))))
+
+(ert-deftest rx-redefine-builtin ()
+  (should-error (rx-define sequence () "x"))
+  (should-error (rx-define sequence "x"))
+  (should-error (rx-define nonl () "x"))
+  (should-error (rx-define nonl "x"))
+  (should-error (rx-let ((punctuation () "x")) nil))
+  (should-error (rx-let ((punctuation "x")) nil))
+  (should-error (rx-let-eval '((not-char () "x")) nil))
+  (should-error (rx-let-eval '((not-char "x")) nil)))
 
 (ert-deftest rx-constituents ()
   (let ((rx-constituents



reply via email to

[Prev in Thread] Current Thread [Next in Thread]