[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: Emacs needs truely useful flex matching
From: |
Stefan Monnier |
Subject: |
Re: Emacs needs truely useful flex matching |
Date: |
Mon, 15 Apr 2013 09:50:50 -0400 |
User-agent: |
Gnus/5.13 (Gnus v5.13) Emacs/24.3.50 (gnu/linux) |
> Consider "fafbfcfd-bcd", your regexp will match "afbfcfd" from the first
> word, but we want to prefer "a" from the first word, "bcd" from the second
> word.
Duh! Thanks for spelling it out.
So indeed you'd need regexps like
(re "ab...") = (concat "\\<a\\([^b]*\\)" (re "b...")
"\\|a\\([^b]*\\)" (re "b..."))
which blows up very quickly both in size and in matching time.
You can probably make it work with regexps, but it'll take more work:
- Match using the kind of regexp I suggested. Use it to filter out
uninteresting candidates.
- For those elements that match, try to match with stronger regexps
to see if the match's value can be improved. I.e. for "abcd", you'd
first try "\\<a\\([^b]*\\)b\\([^c]*\\)c\\([^d]*\\)d", then
either try "a\\([^b]*\\)\\<b\\([^c]*\\)c\\([^d]*\\)d" or
try "\\<a\\([^b]*\\)\\<b\\([^c]*\\)c\\([^d]*\\)d" depending on the
first test, etc...
For an N-char key, that's O(N) regexp-matches, tho it may also turn out
to require O(N) regexp-compilations, so it's probably OK but might
require some extra work to optimize it further.
> Actually can I skip the GC somehow since I know I'm about to allocate a
> bunch of memory that never gets freed.
You can let-bound gc-cons-threshold, but I think it's not worth
the risk (the let-binding may apply to more code than expected in
situations such as debugging).
Stefan