Re: Bug in GNUstep implementation of NSRegularExpression?
From: Fred Kiefer
Subject: Re: Bug in GNUstep implementation of NSRegularExpression?
Date: Fri, 11 Apr 2014 23:54:13 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.3.0

On 08.04.2014 16:14, Mathias Bauer wrote:
> Hi,
>
> the following simple test program throws an exception:
>
>
>> #import <Foundation/Foundation.h>
>>
>> int main(int argc, const char * argv[])
>> {
>>     @autoreleasepool
>>     {
>>         NSString* text = @"h1. Real Acme\n\n"
>>             @"||{noborder}{left}Item||{right}Price||\n"
>>             @"|Testproduct|{right}2 x $59.50|\n"
>>             @"| |{right}net amount: $100.00|\n"
>>             @"| |{right}total amount: $119.00|\n\n\n"
>>             @"h2. Thanks for your purchase!\n\n\n";
>>
>>         // NSRegularExpression* expr = [NSRegularExpression
>>         //     regularExpressionWithPattern:@".*?$"
>>         //     options:NSRegularExpressionAnchorsMatchLines error:NULL];
>>         // int currentIndex = 27;
>>
>>         NSRegularExpression* expr = [NSRegularExpression
>>             regularExpressionWithPattern:@"h[123]\\. "
>>             options:NSRegularExpressionCaseInsensitive error:NULL];
>>         int currentIndex = 33;
>>
>>         [expr firstMatchInString:text options:NSMatchingAnchored
>>             range:NSMakeRange(currentIndex, [text length]-currentIndex-1)];
>>     }
>>     return 0;
>> }
>
> The call to firstMatchInString ends up calling uregex_lookingAt
> (thus carrying out a regex match) and afterwards calling uregex_start
> and uregex_end (thus retrieving the matched text range). The results of
> the two latter calls will be used to create an NSRange object in the
> prepareResult function of NSRegularExpression.m. And because the length
> of this range is negative, an exception is thrown.
>
> Let's have a look at the data:
>
> The matching region starts at position 33, it ends at the string end.
> This region has been set at the regex by calling uregex_setRegion (in
> the setupRegex function in NSRegularExpression.m).
>
> According to the documentation, uregex_start should return the index in
> the input string of the start of the text matched. In my book this
> should be the position of the "h2" near the end of the string.
>
> According to the documentation, uregex_end should return the index in
> the input string of the position following the end of the text matched.
> In my book that should be start + 4.
>
> But I get back: 33 for start and 4 for end. That obviously can't work.
>
> I can't believe that the ICU regex implementation (I'm using ICU 4.8 on
> Ubuntu 13.10 64-bit) is broken to this extent, so probably the
> NSRegularExpression implementation uses it incorrectly. But OTOH I can't
> spot an obvious error.
>
> Any hints would be greatly appreciated.

No hint, just some feedback. I was able to reproduce your problem on my
GNUstep installation but completely failed to understand why uregex
comes up with 4 as the result of uregex_end.
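
The exception itself at least follows directly from the reported values: with start = 33 and end = 4, the length end - start is -29, which cannot be stored in an unsigned range length. As a minimal sketch in plain C (also valid Objective-C), the kind of check that would catch such a pair before a range is built could look like the code below. SketchRange and make_range_checked are hypothetical names made up for this illustration; they are not part of GNUstep or ICU, and this is not what prepareResult currently does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for NSRange (assumption: two unsigned fields). */
typedef struct {
    unsigned long location;
    unsigned long length;
} SketchRange;

/*
 * Hypothetical helper: build a range from the start/end indices that a
 * matcher reports. Returns false and leaves *out untouched when the pair
 * is inconsistent (negative start, or end before start), instead of
 * computing a negative length.
 */
bool make_range_checked(int32_t start, int32_t end, SketchRange *out)
{
    assert(out != NULL);

    if (start < 0 || end < start) {
        return false;
    }

    out->location = (unsigned long)start;
    out->length = (unsigned long)(end - start);
    return true;
}
```

With the values from the report, make_range_checked(33, 4, &r) returns false, while a consistent pair such as (33, 37) yields location 33 and length 4. This only turns the symptom into a detectable failure; it does not explain why uregex_end reports 4 in the first place.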