Re: Ugly regexps

From: Stefan Kangas
Subject: Re: Ugly regexps
Date: Tue, 2 Mar 2021 19:32:23 -0600

Stefan Monnier <monnier@iro.umontreal.ca> writes:

> BTW, while this theme of ugly regexps keeps coming up, how 'bout we add
> a new function `ere` which converts between the ERE style of regexps
> where grouping parens are not escaped (and plain chars meant to match
> an actual paren need to be escaped instead) to ELisp-style regexps?
> So you can do
>     (string-match (ere "\\(def(macro|un|subst) .{1,}"))
> instead of
>     (string-match "(def\\(macro\\|un\\|subst\\) .\\{1,\\}")
> ?

Sounds good to me.
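
For illustration, here is a rough sketch of what such a converter might
look like, assuming it only needs to swap the escaping of ( ) | { }.
The name `my-ere-to-emacs` is hypothetical (nothing like it exists in
Emacs today), and bracket expressions such as [(] are not handled:

```elisp
;; Hypothetical helper, not part of Emacs: a rough ERE-to-Emacs converter.
;; Assumption: only the escaping of ( ) | { } differs between the two
;; syntaxes.  Bracket expressions like "[(]" are NOT special-cased.
(defun my-ere-to-emacs (ere)
  "Convert the ERE-style regexp string ERE to Emacs regexp syntax."
  (let ((toggled (string-to-list "()|{}"))
        (i 0)
        (len (length ere))
        (out '()))
    (while (< i len)
      (let ((c (aref ere i)))
        (cond
         ;; Backslash plus another char: drop the backslash if the char
         ;; is one of the toggled ones, otherwise keep the pair as-is.
         ((and (eq c ?\\) (< (1+ i) len))
          (let ((next (aref ere (1+ i))))
            (if (memq next toggled)
                (push (string next) out)
              (push (string c next) out)))
          (setq i (+ i 2)))
         ;; Bare toggled char: it is an operator in ERE, so escape it.
         ((memq c toggled)
          (push (string ?\\ c) out)
          (setq i (1+ i)))
         ;; Anything else is copied through unchanged.
         (t
          (push (string c) out)
          (setq i (1+ i))))))
    (apply #'concat (nreverse out))))

;; Converting the ERE-style example from the quoted proposal:
(my-ere-to-emacs "\\(def(macro|un|subst) .{1,}")
;; => "(def\\(macro\\|un\\|subst\\) .\\{1,\\}"
```

A real implementation would at least have to parse bracket expressions
and decide what to do with constructs that only one syntax supports.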

I was going to ask why not just do PCRE, but then I realized I'm not
exactly sure what the syntactical differences are.  (We obviously lack
some features.)  AFAIR, Emacs regexps don't exactly match GNU grep,
egrep, Perl, or anything else really.

So I cranked out my dusty old copy of Mastering Regular Expressions and
found this overview:

    grep           egrep          Emacs          Perl
    \? \+ \|       ? + |          ? + \|         ? + |
    \( \)          ( )            \( \)          ( )
                   \< \>          \< \> \b \B    \b \B

    (Excerpt from Mastering Regular Expressions: Table 3-3: A (Very)
    Superficial Look at the Flavor of a Few Common Tools)

This shows the differences that most commonly bite you, in my
experience.
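
To make the Emacs column of the table concrete, here is a small sketch
(the sample strings are made up, and the word-boundary results assume
the standard syntax table, where letters are word constituents):

```elisp
;; In Emacs, bare ? and + are operators, while grouping and alternation
;; need backslashes (doubled here because they sit inside Lisp strings).
(string-match "\\(ab\\)+c?\\|xyz" "ababc")   ; => 0

;; \< and \> match at the start and end of a word.
(string-match "\\<foo\\>" "a foo b")         ; => 2
(string-match "\\<foo\\>" "afoob")           ; => nil
```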

While we're at it, has it ever been discussed to add support for the
PCRE library side by side with our homegrown regexp.c?  It would give us
sane (standard) syntax and some useful features "for free"
(e.g. lookaround).  I didn't test but a priori I would also assume the
code to be much more performant than anything we could ever cook up
ourselves.  It is used by several high-profile projects.

I would imagine we'd introduce entirely new function names for it.
Perhaps even a completely new and improved API like Lars suggested a
while back.
