emacs-devel

Re: New rx implementation with extension constructs


From: Noam Postavsky
Subject: Re: New rx implementation with extension constructs
Date: Thu, 5 Sep 2019 11:38:23 -0400

> works just as expected. &rest arguments are permitted, and expand to
> implicit (seq ...) forms.  No provision was made for macros able to
> execute arbitrary Lisp code; I just couldn't find a use for them, and
> decided to wait until someone would tell me otherwise. Thus, all
> parametrised forms work by plain substitution.

Do you mean that macros don't support (literal LISP-FORM) and (regexp
LISP-FORM)?  Or something else?

> +;; The `rx--translate...' functions below return (REGEXP . PRECEDENCE),
> +;; where REGEXP is a list of string expressions that will be
> +;; concatenated into a regexp, and PRECEDENCE is one of
> +;;
> +;;  t    -- can be used as argument to postfix operators
> +;;  seq  -- can be concatenated in sequence with other seq or higher
> +;;  lseq -- can be concatenated to the left of rseq or higher
> +;;  rseq -- can be concatenated to the right of lseq or higher
> +;;  nil  -- can only be used in alternatives
> +;;
> +;; They form a lattice:
> +;;
> +;;           t          highest precedence
> +;;           |
> +;;          seq
> +;;         /   \
> +;;      lseq   rseq
> +;;         \   /
> +;;          nil         lowest precedence

It would help to add some concrete examples (i.e., of things that
would count as `t', `seq', etc) to this abstract explanation.
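As a rough illustration of why the translator needs to track precedence at all (a sketch using only the public `rx-to-string' entry point, not the internal `rx--translate...' functions from the patch; the `demo-' names are hypothetical): a postfix operator can be applied directly to a single character or to `.', but a multi-character string has to be bracketed first.

```elisp
(require 'rx)

;; Hypothetical demo variables; only `rx-to-string' is a real rx API here.
;; The second argument t asks rx not to bracket the overall result.
(defvar demo-star-char (rx-to-string '(* "a") t))    ; "a*"
(defvar demo-star-dot (rx-to-string '(* nonl) t))    ; ".*"
(defvar demo-star-string (rx-to-string '(* "ab") t)) ; "\\(?:ab\\)*"

;; Without the shy group, the star would bind to "b" only:
;; "ab*" matches "abbb" but not "ababab".
(string-match-p (concat "\\`" demo-star-string "\\'") "ababab") ; 0
(string-match-p "\\`ab*\\'" "ababab")                           ; nil
```

In the terms of the quoted comment, "a" and "." would presumably be `t' and "ab" would be `seq'; that mapping is an inference from the output, not something stated in the patch.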

> +(defun rx--translate-symbol (sym)
> +  "Translate an rx symbol.  Return (REGEXP . PRECEDENCE)."
> +  (pcase sym
> +    ((or 'nonl 'not-newline 'any) (cons (list ".") t))

Is there a reason not to use '((".") . t) here (and similar for the rest
of the alternatives)?  If yes, then it's probably worth mentioning in a
comment.
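One possible reason, sketched here with hypothetical helper names and without any claim that it is the patch author's motivation: a quoted constant is the same object on every call, so it is only safe if no caller ever modifies the returned list destructively, whereas `cons' and `list' allocate a fresh value each time.

```elisp
;; Hypothetical helpers, not part of rx.el.
(defun demo-literal-result ()
  "Return a quoted constant; every call yields the same object."
  '((".") . t))

(defun demo-fresh-result ()
  "Return a newly allocated value on every call."
  (cons (list ".") t))

(eq (demo-literal-result) (demo-literal-result)) ; t: shared object
(eq (demo-fresh-result) (demo-fresh-result))     ; nil: distinct objects

;; Destructively extending a fresh result does not affect later calls.
;; Doing the same to the quoted constant would modify the function's
;; own constant, which is why it is not done here.
(let ((r (demo-fresh-result)))
  (nconc (car r) (list "*"))
  (car (demo-fresh-result)))                     ; (".")
```

If none of the callers modify the result in place, the quoted form is equivalent and saves the allocation.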

> +(defun rx--string-to-intervals (str)
> +  "Decode STR as intervals: A-Z becomes (?A . ?Z), and the single
> +character X becomes (?X . ?X).  Return the intervals in a list."
> +  ;; We could just do string-to-multibyte on the string and work with
> +  ;; that instead of this `decode-char' workaround.
>    (let ((decode-char
> -         ;; Make sure raw bytes are decoded as such, to avoid confusion with
> -         ;; U+0080..U+00FF.
>           (if (multibyte-string-p str)
>               #'identity
>             (lambda (c) (if (<= #x80 c #xff)
> @@ -483,477 +280,657 @@ rx-check-any-string
>                           c))))

If not using string-to-multibyte, I think this lambda can be replaced
with #'unibyte-char-to-multibyte.
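For reference, a small sketch of what the two built-ins do with raw bytes (the helper name `demo-decode-bytes' is hypothetical, and this is not a drop-in replacement for the quoted function): ASCII bytes map to themselves, while bytes in #x80..#xFF map to the raw-byte characters #x3FFF80..#x3FFFFF rather than to U+0080..U+00FF.

```elisp
(defun demo-decode-bytes (str)
  "Return the characters of unibyte STR as multibyte character codes.
Hypothetical helper for illustration only."
  (mapcar #'unibyte-char-to-multibyte str))

(unibyte-char-to-multibyte ?a)         ; 97, ASCII is unchanged
(unibyte-char-to-multibyte #x80)       ; #x3fff80, a raw-byte character
(demo-decode-bytes "\200\377")         ; (#x3fff80 #x3fffff)

;; `string-to-multibyte' performs the same mapping on a whole string:
(aref (string-to-multibyte "\200") 0)  ; #x3fff80
```

Note that `unibyte-char-to-multibyte' signals an error for arguments of #x100 and above, so it is only suitable for characters taken from a unibyte string, which is the branch where the lambda is used.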


