Re: [Regexp] Finding failure point in RE
From: Wes Biggs
Subject: Re: [Regexp] Finding failure point in RE
Date: Mon, 26 Jul 2004 14:18:53 -0700
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.7.1) Gecko/20040707
mitch-GNU RegExp List wrote:
> I posted this question a while ago but got no response, so I'll try once more...
> Is there a way (or plans to develop a way) to discover where in a regular
> expression matching failed (i.e. didn't find a match)? Or alternately,
> but not as useful, where in the regular expression is the last point that
> successfully matched the input string?
> Background: I created a system that uses regular expressions to match
> against the contents of incoming emails that contain output from various
> status checks, operational tasks, etc. When a match fails, it is a
> time-consuming process to discover where the failure point is. An index into
> the regular expression (or input string I guess) that showed where the match
> failed would be very useful and time saving.
Hi Mitch -- the short answer is no, there is not currently a way or a
plan to implement this.
I'm assuming you're applying this to a situation where you're using
isMatch() -- otherwise the logic gets a little ambiguous, because a
failed RE will fail at every point along the input.
You could add a method like
int RE::getLengthMatched(input)
which would execute similarly to isMatch() but keep the contextual
information such that
RE.isMatch(input) ==> (RE.getLengthMatched(input) == input.length())
Here's some untested off-the-cuff code you can try adding to RE.java:
    public int getLengthMatched(Object o, int index, int eflags) {
        CharIndexed input = makeCharIndexed(o, index);
        if (firstToken == null) { return 0; } // Trivial case of empty regexp
        REMatch m = new REMatch(numSubs, index, eflags);
        int max = 0; // Furthest input index seen; stays 0 if match() returns false
        if (firstToken.match(input, m)) {
            // Walk the linked REMatch entries and keep the largest index
            while (m != null) {
                if (m.index > max) { max = m.index; }
                m = m.next;
            }
        }
        return max;
    }
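If patching RE.java isn't an option, a rough way to locate the failure point in the input (rather than in the expression) can be sketched with the standard java.util.regex package instead of gnu.regexp. This is a different technique from the method above: it tries ever shorter prefixes of the input and uses Matcher.hitEnd() to ask whether the engine ran out of input while it was still matching. The class and method names (FailurePoint, viablePrefixLength) and the sample strings are made up for illustration, and this is only a sketch, not a tested tool.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FailurePoint {
    /**
     * Returns the length of the longest prefix of input that either matches
     * the whole pattern or could still match if more input followed.
     * Equals input.length() when the whole input matches.
     * Returns 0 when no prefix is viable, or only the empty prefix is.
     */
    static int viablePrefixLength(Pattern pattern, String input) {
        for (int len = input.length(); len >= 0; len--) {
            Matcher m = pattern.matcher(input.substring(0, len));
            // matches(): the prefix satisfies the entire pattern.
            // hitEnd(): the engine reached the end of the prefix while
            // searching, so more input might have produced a match.
            if (m.matches() || m.hitEnd()) {
                return len;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        // Hypothetical status line; the input says "disc" where "disk" is expected.
        Pattern p = Pattern.compile("Status: OK, disk \\d+%");
        String input = "Status: OK, disc 42%";
        int n = viablePrefixLength(p, input);
        System.out.println("Input stops being matchable after " + n + " characters");
        System.out.println("Matched so far: \"" + input.substring(0, n) + "\"");
    }
}
```

This costs one match attempt per input character, so it is better suited to diagnosing a failure after the fact than to running on every incoming message.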
Wes