Re: rx.el sexp regexp syntax
From: Stefan Monnier
Subject: Re: rx.el sexp regexp syntax
Date: Mon, 04 Jun 2018 09:56:56 -0400
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/27.0.50 (gnu/linux)
> Even after removing "extra" backslashes, it's still a bear:
>
> "([0-9][BkKMGTPEZY]?
> (([0-9][0-9][0-9][0-9]-)?[01][0-9]-[0-3][0-9][ T][
> 0-2][0-9][:.][0-5][0-9](:[0-6][0-9]([.,][0-9]+)?(
> ?[-+][0-2][0-9][0-5][0-9])?)?|[0-9][0-9][0-9][0-9]-[01][0-9]-[0-3][0-9])|.*[0-9][BkKMGTPEZY]?
> ((([A-Za-z']|[^\0-])([A-Za-z']|[^\0-])+\\.? +[ 0-3][0-9]|[ 0-3][0-9]\\.?
> ([A-Za-z']|[^\0-])([A-Za-z']|[^\0-])+\\.?)
> +([
> 0-2][0-9][:.][0-5][0-9]|[0-9][0-9][0-9][0-9])|([A-Za-z']|[^\0-])([A-Za-z']|[^\0-])+\\.?
> +[ 0-3][0-9], +[0-9][0-9][0-9][0-9]|([ 0-1]?[0-9]([A-Za-z]|[^\0-])?
> [ 0-3][0-9]([A-Za-z]|[^\0-])? +|[ 0-3][0-9] [ 0-1]?[0-9]
> +)([ 0-2][0-9][:.][0-5][0-9]|[0-9][0-9][0-9][0-9]([A-Za-z]|[^\0-])?))) +"
For such regexps, the exact syntax (PCRE, BRE, ERE, RX, ...) in use has
fairly little importance: if written "raw" as above, it will be
indecipherable in any case.
To make it readable, you need to add human-level explanations,
e.g. by adding comments and naming sub-elements, which is indeed what
is done in the source code:
(defvar directory-listing-before-filename-regexp
  (let* ((l "\\([A-Za-z]\\|[^\0-\177]\\)")
         (l-or-quote "\\([A-Za-z']\\|[^\0-\177]\\)")
         ;; In some locales, month abbreviations are as short as 2 letters,
         ;; and they can be followed by ".".
         ;; In Breton, a month name can include a quote character.
         (month (concat l-or-quote l-or-quote "+\\.?"))
         (s " ")
         (yyyy "[0-9][0-9][0-9][0-9]")
         (dd "[ 0-3][0-9]")
         (HH:MM "[ 0-2][0-9][:.][0-5][0-9]")
         (seconds "[0-6][0-9]\\([.,][0-9]+\\)?")
         (zone "[-+][0-2][0-9][0-5][0-9]")
         (iso-mm-dd "[01][0-9]-[0-3][0-9]")
         (iso-time (concat HH:MM "\\(:" seconds "\\( ?" zone "\\)?\\)?"))
         (iso (concat "\\(\\(" yyyy "-\\)?" iso-mm-dd "[ T]" iso-time
                      "\\|" yyyy "-" iso-mm-dd "\\)"))
         (western (concat "\\(" month s "+" dd "\\|" dd "\\.?" s month "\\)"
                          s "+"
                          "\\(" HH:MM "\\|" yyyy "\\)"))
         (western-comma (concat month s "+" dd "," s "+" yyyy))
         ;; Japanese MS-Windows ls-lisp has one-digit months, and
         ;; omits the Kanji characters after month and day-of-month.
         ;; On Mac OS X 10.3, the date format in East Asian locales is
         ;; day-of-month digits followed by month digits.
         (mm "[ 0-1]?[0-9]")
         (east-asian
          (concat "\\(" mm l "?" s dd l "?" s "+"
                  "\\|" dd s mm s "+" "\\)"
                  "\\(" HH:MM "\\|" yyyy l "?" "\\)")))
    ;; The "[0-9]" below requires the previous column to end in a digit.
    ;; This avoids recognizing `1 may 1997' as a date in the line:
    ;; -r--r--r-- 1 may 1997 1168 Oct 19 16:49 README
    ;; The "[BkKMGTPEZY]?" below supports "ls -alh" output.
    ;; For non-iso date formats, we add the ".*" in order to find
    ;; the last possible match. This avoids recognizing
    ;; `jservice 10 1024' as a date in the line:
    ;; drwxr-xr-x 3 jservice 10 1024 Jul 2 1997 esg-host
    ;; vc dired listings provide the state or blanks between file
    ;; permissions and date. The state is always surrounded by
    ;; parentheses:
    ;; -rw-r--r-- (modified) 2005-10-22 21:25 files.el
    ;; This is not supported yet.
    (purecopy (concat "\\([0-9][BkKMGTPEZY]? " iso
                      "\\|.*[0-9][BkKMGTPEZY]? "
                      "\\(" western "\\|" western-comma "\\|" east-asian "\\)"
                      "\\) +")))
  "Regular expression to match up to the file name in a directory listing.
The default value is designed to recognize dates and times
regardless of the language.")
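
The same technique in miniature may make the point clearer. What follows is
only a sketch: `my-listing-date-regexp' and `my-listing-file-name' are made-up
names, not part of Emacs, and the regexp assumes a single "Oct 19 16:49" or
"Jul  2 1997" style of date rather than the many formats handled above.

```elisp
;; A cut-down sketch of the "name the sub-elements" approach.
;; Hypothetical names; handles only dates like "Oct 19 16:49".
(defvar my-listing-date-regexp
  (let* ((month "[A-Za-z][A-Za-z]+\\.?")      ; e.g. "Oct" or "Oct."
         (s " ")
         (dd "[ 0-3][0-9]")                   ; day of month, maybe space-padded
         (HH:MM "[ 0-2][0-9][:.][0-5][0-9]")
         (yyyy "[0-9][0-9][0-9][0-9]"))
    ;; ".*[0-9] " requires the previous column (the size) to end in a
    ;; digit, and the greedy ".*" prefers the last possible match.
    (concat ".*[0-9] " month s "+" dd s "+"
            "\\(" HH:MM "\\|" yyyy "\\)" s "+"))
  "Sketch: regexp matching up to the file name in an `ls -l' style line.")

(defun my-listing-file-name (line)
  "Return the part of LINE after the date, or nil if no date is found."
  (when (string-match my-listing-date-regexp line)
    (substring line (match-end 0))))

;; Example call on the line quoted in the comments above:
(my-listing-file-name "-r--r--r-- 1 may 1997 1168 Oct 19 16:49 README")
```

Each `let*' binding gives one fragment a name, so the final `concat' reads
almost like a grammar; the same structure could be written with any regexp
syntax.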
-- Stefan