octave-maintainers
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: new strsplit function


From: Ben Abbott
Subject: Re: new strsplit function
Date: Tue, 23 Apr 2013 21:14:45 -0400

On Apr 23, 2013, at 8:18 AM, Ben Abbott wrote:

> On Apr 20, 2013, at 6:49 PM, Ben Abbott wrote:
> 
>> On Apr 20, 2013, at 5:41 PM, Philip Nienhuis wrote:
>> 
>>> Ben Abbott wrote:
>>>> On Apr 20, 2013, at 3:51 PM, Philip Nienhuis wrote:
>>>> 
>>>>> Ben Abbott wrote:
>>>>>>> On Apr 2, 2013, at 11:42 PM, John W. Eaton wrote:
>>> :
>>> <snip>
>>> :
>>>>>>>> Also, the new strsplit does not seem to be working correctly for 
>>>>>>>> single-quoted strings that contain escape sequences.  I think that
>>>>>>>> 
>>>>>>>> strsplit ("foo\tbar", '\t')
>>>>>>>> 
>>>>>>>> should split on the TAB, but it is currently returning the original 
>>>>>>>> string for me.
>>>>>>> 
>>>>>>> Opps.  I hadn't considered single quoted strings.  Your example does 
>>>>>>> work in Matlab.  How is this sort of thing handled in other places?  It 
>>>>>>> is sufficient to just ...
>>> 
>>> <snip>
>>> 
>>>>> And then there is the bug John mentioned with escape characters.
>>>>> On a stable MinGW 3.6.4 build it doesn't work either:
>>>>> 
>>>>> octave:1>  strsplit ("foo\tbar", '\t')
>>>>> ans =
>>>>> {
>>>>> [1,1] = foo   bar
>>>>> }
>>>> 
>>>> Is this with the changeset I attached applied?
>>> 
>>> Sorry I got a but confused, didn't pick up that your "sq" cs was for that 
>>> very issue.
>>> 
>>> I applied it manually to yesterday's MXE-build (from yesterday's updated 
>>> source tree) and now it works correctly:
>>> 
>>> octave-cli:1> test strsplit
>>> PASSES 38 out of 38 tests
>>> octave-cli:2> strsplit ("foo\tbar", '\t')
>>> ans =
>>> {
>>> [1,1] = foo
>>> [1,2] = bar
>>> }
>>> octave-cli:3>
>>> 
>>> 
>>> Anyway, in view of JWE's comments, what is going to happen now with 
>>> strsplit.m?
>>> Do we keep the changes and the "legacy" option, or do we simply follow ML 
>>> as John suggested?
>>> 
>>> As long as speed isn't affected (-significantly) for simple cases I don't 
>>> care very much.
>> 
>> Rik indicated the slow down was about 30x.  If we keep the "legacy" code, 
>> but don't want to commit to supporting the "legacy" option, then I'd just 
>> eliminate "legacy" from the doc-string and throw an error if the user 
>> requests it.
>> 
>>> You have already patched the affected core functions.
>>> In the io package (spreadsheet functions) there are only "simple" calls to 
>>> strsplit (i.e. with just a one-char delimiter) so am I right to assume no 
>>> changes are required there?
>> 
>> If we get rid of the "legacy", then only list_forge_packages.m and strread.m 
>> need to be patched.
>> 
>>> BTW in the texinfo help of strsplit there's neither an (explicit) 
>>> explanation of what the 'simple' DELIMITERTYPE does, nor an example.
>> 
>> I can fix that.  The "simple" version doesn't support regexp expressions.   
>> Although the current implementation of "simple" does rely upon regexp.  
>> There may be some some cases (bugs) where regexp stuff sneaks past by my 
>> clumsy attempt to escape special characters.  Perhaps it is a good idea to 
>> rewrite that part so as not to rely on regexp?
>> 
>> Ben
> 
> I got side tracked looking at how to correctly collect the collapsed 
> delimiters for the 2nd output.  
> 
>       http://hg.savannah.gnu.org/hgweb/octave/rev/03a28487fa9d
> 
> I'll prepare the changeset for sq-strings next.
> 
> Ben

I've pushed a changeset to add support for delimiters that are sq-strings.

        http://hg.savannah.gnu.org/hgweb/octave/rev/d50bca1cdc22

Ben



reply via email to

[Prev in Thread] Current Thread [Next in Thread]