gnustep-dev

Bug in GNUstep implementation of NSRegularExpression?


From: Mathias Bauer
Subject: Bug in GNUstep implementation of NSRegularExpression?
Date: Tue, 08 Apr 2014 16:14:51 +0200
User-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:24.0) Gecko/20100101 Thunderbird/24.4.0

Hi,

the following simple test program throws an exception:


#import <Foundation/Foundation.h>

int main(int argc, const char * argv[])
{
    @autoreleasepool
    {
        // One string literal, split into adjacent pieces only to keep the lines short.
        NSString* text = @"h1. Real "
            @"Acme\n\n||{noborder}{left}Item||{right}Price||\n|Testproduct|{right}2 x $59.50|\n| "
            @"|{right}net amount: $100.00|\n| |{right}total amount: $119.00|\n\n\nh2. Thanks for your "
            @"purchase!\n\n\n";

        // NSRegularExpression* expr = [NSRegularExpression
        //     regularExpressionWithPattern:@".*?$"
        //     options:NSRegularExpressionAnchorsMatchLines error:NULL];
        // int currentIndex = 27;

        NSRegularExpression* expr = [NSRegularExpression
            regularExpressionWithPattern:@"h[123]\\. "
            options:NSRegularExpressionCaseInsensitive error:NULL];
        int currentIndex = 33;

        [expr firstMatchInString:text options:NSMatchingAnchored
            range:NSMakeRange(currentIndex, [text length]-currentIndex-1)];
    }
    return 0;
}

The call to firstMatchInString ends up calling uregex_lookingAt (which carries out the regex match) and then uregex_start and uregex_end (which retrieve the matched text range). The results of the latter two calls are used to create an NSRange in the prepareResult function of NSRegularExpression.m. Because the length of this range is negative, an exception is thrown.

Let's have a look at the data:

The matching region starts at position 33 and ends at the end of the string.
This region is set on the regex by calling uregex_setRegion (in the setupRegex function in NSRegularExpression.m).
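
For reference, here is a minimal sketch of how the location/length pair from the test program maps to a half-open region [start, limit), which is the form uregex_setRegion takes (regionStart, regionLimit). The type and helper names are hypothetical and are not taken from NSRegularExpression.m. Assuming the range expression from the test program is used unchanged, the limit comes out one position before the end of the string, not at the string end itself.

```c
#include <stddef.h>

/* Hypothetical type: a half-open region [start, limit) in the input text. */
typedef struct {
    size_t start;
    size_t limit;
} SketchRegion;

/* Convert a location/length pair (as in NSMakeRange) to start/limit. */
static SketchRegion sketch_region(size_t location, size_t length)
{
    SketchRegion r = { location, location + length };
    return r;
}

/* Mirrors the range expression used in the test program:
 * NSMakeRange(currentIndex, [text length] - currentIndex - 1). */
static SketchRegion sketch_region_from_test(size_t text_length, size_t current_index)
{
    return sketch_region(current_index, text_length - current_index - 1);
}
```

For any text length n greater than the start index, this gives start = 33 and limit = n - 1.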

According to the documentation, uregex_start should return the index in the input string of the start of the text matched. In my book this should be the position of the "h2" near the end of the string.

According to the documentation, uregex_end should return the index in the input string of the position following the end of the text matched. In my book that should be start + 4.

But I get back: 33 for start and 4 for end. That obviously can't work.
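
To make the arithmetic concrete, here is a small sketch of building a range from a start/end pair, with a guard against inconsistent offsets. It assumes the length is computed as end - start. SketchRange and both helper functions are hypothetical stand-ins for illustration, not the actual code in prepareResult.

```c
#include <stdint.h>

/* Hypothetical stand-in for NSRange; not GNUstep's definition. */
typedef struct {
    unsigned long location;
    unsigned long length;
} SketchRange;

/* Length as plain signed arithmetic on the two offsets. */
static int32_t sketch_signed_length(int32_t start, int32_t end)
{
    return end - start;
}

/* Returns 1 and fills *out when [start, end) is a valid span.
 * Returns 0 when the offsets are inconsistent. */
static int sketch_range_from_offsets(int32_t start, int32_t end, SketchRange *out)
{
    if (start < 0 || end < start) {
        return 0;
    }
    out->location = (unsigned long)start;
    out->length = (unsigned long)(end - start);
    return 1;
}
```

With the values reported here, sketch_signed_length(33, 4) is -29 and sketch_range_from_offsets(33, 4, &r) returns 0, whereas the expected pair (33, 37) would give location 33 and length 4.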

I can't believe that the ICU regex implementation (I'm using ICU 4.8 on 64-bit Ubuntu 13.10) is broken to this extent, so the NSRegularExpression implementation is probably using it incorrectly. On the other hand, I can't spot an obvious error.

Any hints would be greatly appreciated.

Regards,
Mathias


