Re: [Bug-apl] Regex support


From: Xiao-Yong Jin
Subject: Re: [Bug-apl] Regex support
Date: Wed, 20 Sep 2017 15:12:40 -0500

An APL wrapper (⎕regexp[OP]) around a simple API like this one would be great
(in that API, a "rune" is a Unicode character):

https://9fans.github.io/plan9port/man/man3/regexp.html

One can build more APL functions out of these without much performance penalty.

On the other hand, if there is a DFA implementation provided by APL (cf. J's 
dyadic ;:)

http://www.jsoftware.com/help/dictionary/d332.htm

one can probably write the regular expression engine within an APL function 
with minimal performance loss.
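
To make the DFA idea concrete, here is a minimal sketch in Python (not APL, and not J's actual ;: semantics) of a table-driven matcher. The character classes, the state numbering and the name dfa_accepts are all made up for illustration. The point is only that the inner loop is one table lookup per character, which is the part a built-in primitive would do natively.

```python
# Hypothetical sketch of a table-driven DFA; this is NOT J's ;: primitive.
# It accepts strings of the form digits(-digits)*, e.g. "2017-01-02".

DIGIT, DASH, OTHER = 0, 1, 2  # character classes (columns of the table)


def classify(ch):
    """Map a character to its class index."""
    if "0" <= ch <= "9":
        return DIGIT
    if ch == "-":
        return DASH
    return OTHER


# Rows are states, columns are character classes.
# 0 = start (needs a digit), 1 = inside digits (accepting),
# 2 = just saw a dash (needs a digit), 3 = dead state.
TABLE = [
    [1, 3, 3],
    [1, 2, 3],
    [1, 3, 3],
    [3, 3, 3],
]
ACCEPTING = {1}


def dfa_accepts(text):
    """Run the DFA over text and report whether it ends in an accepting state."""
    state = 0
    for ch in text:
        state = TABLE[state][classify(ch)]
    return state in ACCEPTING


print(dfa_accepts("2017-01-02"))
print(dfa_accepts("2017-"))
```

A regex engine built this way would generate TABLE from the pattern and leave the loop to the primitive.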

> On Sep 20, 2017, at 2:47 PM, Juergen Sauermann <address@hidden> wrote:
> 
> Hi Elias,
> 
> I am generally in favour of supporting regular expressions in GNU APL.
> 
> We should do that in a way that is compatible with the way in which the most 
> commonly used libraries
> do that (even if they are lacking some features that more exotic libraries 
> may have). Unfortunately I do not
> have a full overview of all (or even any) existing libraries. I personally 
> love grep and hate perl (the latter not
> only because of their regexes).
> 
> I would like to avoid constructs like s/aaa/bbb/ where operations are kind of 
> text-encoded into strings.
> That is, IMHO, a hack-ish programming style and should be replaced by a more 
> APL-like syntax such as
> 'aaa' ⎕REX['s'] 'bbb' or maybe 's' ⎕REX 'aaa' 'bbb'. 
> 
> Or, if the number of operations is small (perl seems to have only 2, not 
> counting the translate which is already
> covered by other APL functions), then we could also have different 
> ⎕-functions for them and thus avoiding a
> third argument.
> 
> Everybody else, please feel invited to join the discussion.
> 
> Best Regards,
> Jürgen Sauermann
> 
> 
> On 09/20/2017 05:59 AM, Elias Mårtenson wrote:
>> On several occasions, I have felt that built-in regex support in GNU APL 
>> would be very helpful.
>> 
>> Implementing it should be rather simple, but I'd like to discuss how such an 
>> API should look in order for it to be as useful as possible.
>> 
>> I was thinking of the following form:
>> 
>>       regex ⎕Regex string
>> 
>> The way I envision this to work, is to have the function return ⍬ if there 
>> is no match, or a string containing the match, if there is one:
>> 
>>       'f..' ⎕Regex 'xzooy'
>> ┏⊖┓
>> ┃0┃
>> ┗━┛
>>       'f..' ⎕Regex 'xfooy'
>> 'foo'
>> 
>> If the regex has subexpressions, those matches should be returned as 
>> individual strings:
>> 
>>       '([0-9]+)-([0-9]+)-([0-9]+)' ⎕Regex '2017-01-02'
>> ┏→━━━━━━━━━━━━━━━┓
>> ┃"2017" "01" "02"┃
>> ┗∊━━━━━━━━━━━━━━━┛
>> 
>> This would be a very useful API, and reasonably easy to implement by simply 
>> calling into the standard regcomp() call: 
>> http://pubs.opengroup.org/onlinepubs/009695399/functions/regcomp.html
>> 
>> What do you think? Is this a reasonable way to implement it? Any suggestions 
>> about alternative APIs?
>> 
>> Regards,
>> Elias
> 
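
For reference, the return convention Elias proposes above can be modeled in a few lines of Python. This is only a sketch of the semantics, not a proposal for the implementation: regex_match is a hypothetical name, an empty list stands in for ⍬, and Python's re module is used in place of POSIX regcomp(), so the pattern syntax differs in some details.

```python
import re


def regex_match(pattern, string):
    """Model of the proposed  regex ⎕Regex string  result (hypothetical helper).

    - no match:                    empty list (stand-in for APL's ⍬)
    - match, no subexpressions:    the matched substring
    - match, with subexpressions:  list of the subexpression matches
    """
    m = re.search(pattern, string)
    if m is None:
        return []
    if m.re.groups == 0:
        return m.group(0)
    return list(m.groups())


print(regex_match("f..", "xzooy"))
print(regex_match("f..", "xfooy"))
print(regex_match("([0-9]+)-([0-9]+)-([0-9]+)", "2017-01-02"))
```

One thing the model makes visible is that the result has a different shape in each of the three cases, so callers would have to inspect it before use.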