bug#37849: composable character alternatives in rx
From: Mattias Engdegård
Subject: bug#37849: composable character alternatives in rx
Date: Mon, 21 Oct 2019 12:24:21 +0200
Now that rx is user-extendible, some holes are showing. Example (from
python.el):
    (simple-operator . ,(rx (any ?+ ?- ?/ ?& ?^ ?~ ?| ?* ?< ?> ?= ?%)))
    ;; FIXME: rx should support (not simple-operator).
    (not-simple-operator . ,(rx
                             (not
                              (any ?+ ?- ?/ ?& ?^ ?~ ?| ?* ?< ?> ?= ?%))))
(This code uses the old rx-constituents mechanism, but the point applies
equally to new-style definitions.)
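For comparison, one way to avoid writing the character list twice today is to keep it in an ordinary Lisp constant and splice it into both forms. This is only a sketch of a workaround, not what python.el does; the names `my-simple-operator-chars`, `my-simple-operator-re` and `my-not-simple-operator-re` are hypothetical. It sidesteps rx definitions entirely, which is exactly the hole being described.

```elisp
(require 'rx)

;; Hypothetical constant: the operator characters, written once.
(defconst my-simple-operator-chars
  '(?+ ?- ?/ ?& ?^ ?~ ?| ?* ?< ?> ?= ?%))

;; The set itself, built by splicing the list into an (any ...) form.
(defconst my-simple-operator-re
  (rx-to-string `(any ,@my-simple-operator-chars)))

;; Its complement, built from the same list.
(defconst my-not-simple-operator-re
  (rx-to-string `(not (any ,@my-simple-operator-chars))))
```

With these, (string-match-p my-simple-operator-re "+") matches, while my-not-simple-operator-re matches any character outside the list.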
More generally, there is currently no way to:
(1) Get the complement of a defined (any ...) form
(2) Get the union of two defined (any ...) forms
(3) Get the intersection of two defined (not (any ...)) forms
(1), which the example above was about, could be solved by expanding
definitions inside 'not'. This is a step away from the principle that
user-defined things are only allowed where general rx forms are, but perhaps
tolerable. Proposed patch attached.
(2) can be solved by expanding definitions inside 'any', and allowing 'any'
inside 'any' (flattening). Not sure I like this.
An alternative is to ensure that (or (any X) (any Y)) -> (any X Y), but then we
either need to allow 'or' inside 'not', or add an intersection operator:
(intersect (not (any X)) (not (any Y))) -> (not (any X Y))
We could also make 'not' variadic, turning it into complement-of-union:
(not (any A) (any B)) -> (not (any A B))
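The rewrite rules above can be modelled outside rx by treating the forms as plain lists. The following is a minimal sketch under that assumption: `my-rx-any-args`, `my-rx-any-union` and `my-rx-not-union` are hypothetical helpers, not part of rx, and they only accept literal (any ...) forms, not named definitions. The second helper covers both the 'intersect' rule and the variadic 'not', since the intersection of two complements is the complement of the union.

```elisp
(require 'rx)

(defun my-rx-any-args (form)
  "Return the arguments of FORM, which must be a literal (any ...) form."
  (unless (and (consp form) (eq (car form) 'any))
    (error "Not an (any ...) form: %S" form))
  (cdr form))

(defun my-rx-any-union (&rest forms)
  "Flatten the (any ...) FORMS into one: (any X) (any Y) -> (any X Y)."
  (cons 'any (apply #'append (mapcar #'my-rx-any-args forms))))

(defun my-rx-not-union (&rest forms)
  "Complement of the union: (any A) (any B) -> (not (any A B))."
  (list 'not (apply #'my-rx-any-union forms)))

;; Usage: build the forms, then hand them to rx-to-string.
(defconst my-union-re
  (rx-to-string (my-rx-any-union '(any "a-c") '(any ?x ?y))))
(defconst my-not-union-re
  (rx-to-string (my-rx-not-union '(any "a-c") '(any ?x ?y))))
```

Doing this inside rx itself would additionally require expanding user definitions before flattening, which is the design question raised here.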
Olin Shivers's SRE has a complete and closed set of operations on character
sets (https://scsh.net/docu/post/sre.html). That would be principled and
perhaps useful, but difficult to do fully in rx because not all such
expressions can be rendered into Emacs regexps. Nothing prevents us from making
a partial implementation, however.
Attachment: 0001-Expand-rx-definitions-inside-not.patch