Re: regexp-quote missing escapes in grouping constructs - Bug?

Subject: Re: regexp-quote missing escapes in grouping constructs - Bug?
Date: Fri, 13 Jun 2008 13:36:07 -0400

Yes, but why are "?", "[" and "+" getting escaped regardless of the
presence of a preceding \, whereas the alternation "|" inside the
grouping construct isn't?


example 1)

(regexp-quote "[0-9]{2,4}(-|/)[0-9]?+(-|/)[0-9]{2,4}")
---> "\\[0-9]{2,4}(-|/)\\[0-9]\\?\\+(-|/)\\[0-9]{2,4}"

as compared to 2);

(regexp-quote "[0-9]{2,4}(-?+/)[0-9]?+(-|/)[0-9]{2,4}")
---> "\\[0-9]{2,4}(-\\?\\+/)\\[0-9]\\?\\+(-|/)\\[0-9]{2,4}"

in the second case the ?+ nested inside the group is getting escaped.

Is the "|" not considered a special operator or emacs regexp
metacharacter in the regexp-quote situation?

And if so, why not?
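
A minimal check of this question, assuming a stock Emacs with default settings. It suggests that a bare "|" is an ordinary character in an Emacs regexp and that only the two-character sequence backslash-pipe alternates, which would explain why regexp-quote leaves "|" alone. The constant names are made up for illustration.

```elisp
;; Hypothetical names, for illustration only.
(defconst my-bare-pipe "a|b")        ; three characters: a | b
(defconst my-alternation "a\\|b")    ; four characters: a \ | b

(string-match my-bare-pipe "a|b")    ; => 0, "|" matches itself
(string-match my-bare-pipe "b")      ; => nil, no alternation happens
(string-match my-alternation "b")    ; => 0, backslash-pipe alternates

;; regexp-quote only escapes characters that are special on their own,
;; so the bare pipe is untouched and the backslash gets doubled.
(regexp-quote my-bare-pipe)          ; => "a|b"
(regexp-quote my-alternation)        ; => "a\\\\|b"
```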
The issue is that regexp-opt.el is calling regexp-quote.

If I understand the implications of regexp-opt, it is meant as a
helper function for passing regexps to font-lock-keywords and isn't
intended to accept or 'optimize' existing regexp "words".  So, to feed
a well-formed regexp to font-lock-add-keywords I need to build the
regexp by hand. Likewise, that regexp needs to be passed as a string
with all special characters properly escaped e.g.

(defconst stupid-mode-keywords
'("^[A-z]\\?\\+\\(some\\|stupid\\|regexp\\)\\{2,4\\}" . my-stupid-mode-face))

Am I to understand that the ? and + should be escaped but the |
shouldn't be in order for the regexp to work with font-lock?

FWIW my experience is otherwise, and the previous case doesn't work,
whereas the following does:

(defconst stupid-mode-keywords
'("^[A-z]?+\\(some\\|stupid\\|regexp\\)\\{2,4\\}" . my-stupid-mode-face))
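
The difference between the two forms can be seen outside font-lock with string-match. This is only a sketch: the constant names and the test strings are invented, and the regexps are copied from the two defconsts above.

```elisp
;; Hypothetical names; the regexps are the two from the defconsts above.
(defconst my-quoted-operators-regexp
  "^[A-z]\\?\\+\\(some\\|stupid\\|regexp\\)\\{2,4\\}")
(defconst my-bare-operators-regexp
  "^[A-z]?+\\(some\\|stupid\\|regexp\\)\\{2,4\\}")

;; With "\\?" and "\\+" the regexp demands a literal "?+" in the text.
(string-match my-quoted-operators-regexp "a?+somestupid") ; => 0
(string-match my-quoted-operators-regexp "asomestupid")   ; => nil

;; With bare "?" and "+" they act as repetition operators on [A-z].
(string-match my-bare-operators-regexp "asomestupid")     ; => 0
```

So "?" and "+" are the opposite of "|", "(" and "{": they are operators when bare and literal when preceded by a backslash.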

For my purposes, the larger issue is that I can't find a sensible way
to cons or append a well formed regexp to an existing one without
running into regexp-quote and regexp-opt confusion esp. as I am
unclear as to the correctness of the quoting and escaping of regexps
for font-locking by the two respective functions.
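
One possible way to combine regexps without going through regexp-quote, offered as a sketch rather than a recommended practice: keep hand-written regexps as they are, use regexp-opt only for lists of literal words, and join the pieces with "\\|" inside shy groups. The helper my-regexp-or and the my-* constants are hypothetical names, not anything in Emacs.

```elisp
;; Hypothetical helper: joins already-valid regexps into one alternation.
(defun my-regexp-or (&rest regexps)
  "Return a regexp matching any of REGEXPS.
Each branch is wrapped in a shy group so its operators stay local."
  (mapconcat (lambda (re) (concat "\\(?:" re "\\)")) regexps "\\|"))

;; A hand-written regexp: used as is, never passed to regexp-quote.
(defconst my-date-regexp "[0-9]\\{2,4\\}\\(?:-\\|/\\)[0-9]+")

;; Literal words: regexp-opt does the quoting and optimizing.
(defconst my-word-regexp (regexp-opt '("some" "stupid" "regexp")))

(defconst my-combined-regexp
  (my-regexp-or my-date-regexp my-word-regexp))

(string-match my-combined-regexp "2008-06")    ; => 0
(string-match my-combined-regexp "so stupid")  ; => 3

;; Possible use in a mode hook (assumption: current buffer only):
;; (font-lock-add-keywords
;;  nil `((,my-combined-regexp . font-lock-keyword-face)))
```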

The only solution that seems approachable is to make a new defconst,
defvar, and defface for each new regexp I wish to font-lock.  This
approach is not really maintainable over the long term.

On Fri, Jun 13, 2008 at 2:17 AM, Miles Bader <address@hidden> wrote:
> "St/n_P/rm/n" <address@hidden> writes:
>> (regexp-quote "[0-9]\{2,4\}\(-\|/\)[0-9]?+\(-\|/\)[0-9]\{2,4\}")
>> ---> "\\[0-9]{2,4}(-|/)\\[0-9]\\?\\+(-|/)\\[0-9]{2,4}"
>> Am I misunderstanding something?
> The backslashes you entered in the original lisp string were eaten by
> the lisp reader, so there are no backslashes in the string.  Since (, ),
> |, etc., are not emacs regexp metacharacters (without a preceding
> backslash), there's no need to quote them.
> Here's what you probably meant:
> (regexp-quote "[0-9]\\{2,4\\}\\(-\\|/\\)[0-9]?+\\(-\\|/\\)[0-9]\\{2,4\\}")
> => 
> "\\[0-9]\\\\{2,4\\\\}\\\\(-\\\\|/\\\\)\\[0-9]\\?\\+\\\\(-\\\\|/\\\\)\\[0-9]\\\\{2,4\\\\}"
> -Miles
> --
> Joy, n. An emotion variously excited, but in its highest degree arising from
> the contemplation of grief in another.
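
The reader behaviour Miles describes can be checked directly. A small sketch, with made-up constant names: a single backslash before "(" in a Lisp string is dropped when the string is read, so the regexp functions never see it.

```elisp
;; Hypothetical names, for illustration only.
(defconst my-single-backslash "\(")   ; reader drops the backslash
(defconst my-double-backslash "\\(")  ; reader keeps one backslash

(string= my-single-backslash "(")     ; => t
(length my-single-backslash)          ; => 1
(length my-double-backslash)          ; => 2

;; Only the second string contains a character regexp-quote must escape.
(regexp-quote my-single-backslash)    ; => "("
(regexp-quote my-double-backslash)    ; => "\\\\("
```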
