emacs-devel

Patch for lookaround assertion in regexp


From: Tomohiro MATSUYAMA
Subject: Patch for lookaround assertion in regexp
Date: Thu, 4 Jun 2009 08:04:25 +0900

Hi all,

I have attached a patch that adds lookaround assertions
to regexps, with the following syntax:

* Positive lookahead assertion
    \(?=...\)
* Negative lookahead assertion
    \(?!...\)
* Positive lookbehind assertion
    \(?<=...\)
* Negative lookbehind assertion
    \(?<!...\)

Basically, they work the same as Perl's.
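
To illustrate the Perl-style semantics of the four assertions, here is a minimal sketch using Python's re module as a stand-in. This is an assumption for illustration only: Python writes the groups without backslashes, (?=...) rather than \(?=...\), and the patched Emacs may differ in details. The helper name `starts` is hypothetical.

```python
import re

def starts(pattern, text):
    """Return the start offset of every match of pattern in text."""
    return [m.start() for m in re.finditer(pattern, text)]

# Positive lookahead: "foo" only where "bar" follows.
print(starts(r"foo(?=bar)", "foobar foobaz"))  # [0]

# Negative lookahead: "foo" only where "bar" does NOT follow.
print(starts(r"foo(?!bar)", "foobar foobaz"))  # [7]

# Positive lookbehind: "bar" only where "x" precedes.
print(starts(r"(?<=x)bar", "xbar ybar"))       # [1]

# Negative lookbehind: "bar" only where "x" does NOT precede.
print(starts(r"(?<!x)bar", "xbar ybar"))       # [6]

# Assertions are zero-width: the asserted text is not part of the match.
print(re.search(r"foo(?=bar)", "foobar").group())  # foo
```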

Spec:
* Any pattern is allowed in a lookahead assertion.
* Nested lookaround assertions are allowed.
* Capturing is allowed only in positive lookahead/lookbehind assertions.
* Duplication is allowed after such an assertion.
* Variable-length patterns are NOT yet allowed in lookbehind assertions.
  [x] \(?<=[0-9]+\)MB
  [o] \(?<=[0-9][0-9][0-9][0-9]\)MB
* Lookahead assertion beyond the start bound is not allowed in re-search-backward.
  (re-search-backward "\\(?<=a\\)b") on the buffer "abca_|_b",
  where _|_ marks point, will move to the first "ab".
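
The fixed-width restriction on lookbehind and the capturing rule can be sketched the same way. Python's re module happens to have the same fixed-width limitation, so it is used here as an illustrative stand-in. How the patch itself reports a variable-length lookbehind is not stated above, and the helper name `compiles` is hypothetical.

```python
import re

def compiles(pattern):
    """Return True if Python's re accepts the pattern, False otherwise."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

# Variable-length lookbehind is rejected (the [x] case above).
print(compiles(r"(?<=[0-9]+)MB"))                 # False

# Fixed-width lookbehind is accepted (the [o] case above).
fixed = r"(?<=[0-9][0-9][0-9][0-9])MB"
print(compiles(fixed))                            # True
print(re.search(fixed, "size 1024MB").start())    # 9

# A capture group inside a positive lookahead still records its text.
captured = re.search(r"foo(?=(bar))", "foobar").group(1)
print(captured)                                   # bar
```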

As for performance, I think lookahead assertions are no problem,
but lookbehind assertions are somewhat costly.
You can check that this patch works properly with the test case I have
attached, and also measure performance:
    src/emacs --script regex-test.el perf

I saw that a lookbehind assertion takes about 5 times as long as an ordinary
regexp written to emulate lookbehind. I think I have to improve its performance.
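
One possible reading of a regexp that emulates lookbehind is a pattern that consumes the prefix and captures only the part of interest. The sketch below, again in Python's re as a stand-in and with a hypothetical sample string, shows that the capture group then covers the same span as the lookbehind match. It makes no claim about timing.

```python
import re

text = "size 1024MB"

# With a lookbehind, the digits are checked but not consumed.
with_lookbehind = re.search(r"(?<=[0-9]{4})MB", text)

# Emulation: consume the digits, then capture "MB" in group 1.
emulated = re.search(r"[0-9]{4}(MB)", text)

print(with_lookbehind.span())  # (9, 11)
print(emulated.span(1))        # (9, 11), same span as the lookbehind match
print(emulated.span())         # (5, 11), whole match also includes the digits
```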

Anyway, please try it and review it.
If you like it, please merge it.
I believe that some people really want to use it.

Regards,
MATSUYAMA Tomohiro

Attachment: regex-test.el
Description: Binary data

Attachment: emacs-regex.patch
Description: Binary data

