Re: [Bug-apl] Regex support


From: Juergen Sauermann
Subject: Re: [Bug-apl] Regex support
Date: Wed, 20 Sep 2017 21:47:29 +0200
User-agent: Mozilla/5.0 (X11; Linux i686; rv:45.0) Gecko/20100101 Thunderbird/45.2.0

Hi Elias,

I am generally in favour of supporting regular expressions in GNU APL.

We should do that in a way that is compatible with how the most commonly used libraries
do it (even if they lack some features that more exotic libraries may have). Unfortunately, I do not
have a full overview of all (or even any) existing libraries. I personally love grep and hate Perl (the latter not
only because of its regexes).

I would like to avoid constructs like s/aaa/bbb/ where operations are text-encoded into strings.
That is, IMHO, a hack-ish programming style and should be replaced by a more APL-like syntax such as
'aaa' ⎕REX['s'] 'bbb' or maybe 's' ⎕REX 'aaa' 'bbb'.

Or, if the number of operations is small (Perl seems to have only 2, not counting translate, which is already
covered by other APL functions), then we could also have different ⎕-functions for them and thus avoid a
third argument.
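
To illustrate the "one function per operation" idea at the C level, here is a minimal sketch of a substitute operation built on POSIX regcomp()/regexec(). The name regex_replace_first, its argument order, and its return codes are assumptions made for this example only; nothing like it exists in GNU APL, and a real ⎕-function would work on APL values rather than C strings.

```c
#include <regex.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of GNU APL): replace the first match of
 * `pattern` in `subject` with `replacement`, writing the result to `out`.
 * Returns 1 if a replacement was made, 0 if there was no match (the subject
 * is copied unchanged), and -1 on a regex error or if `out` is too small. */
int regex_replace_first(const char *pattern, const char *replacement,
                        const char *subject, char *out, size_t out_size)
{
    regex_t re;
    regmatch_t m;

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;

    int rc = regexec(&re, subject, 1, &m, 0);
    regfree(&re);

    int n;
    if (rc == REG_NOMATCH) {
        /* No match: hand back the subject as it was. */
        n = snprintf(out, out_size, "%s", subject);
        return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
    }
    if (rc != 0)
        return -1;

    /* Text before the match, then the replacement, then the rest. */
    n = snprintf(out, out_size, "%.*s%s%s",
                 (int)m.rm_so, subject, replacement, subject + m.rm_eo);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 1;
}
```

With this sketch, replacing 'aaa' by 'bbb' in "xaaay" would give "xbbby"; the operation is selected by which function is called, not by a code embedded in the pattern string.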

Everybody else, please feel invited to join the discussion.

Best Regards,
Jürgen Sauermann


On 09/20/2017 05:59 AM, Elias Mårtenson wrote:
On several occasions, I have felt that built-in regex support in GNU APL would be very helpful.

Implementing it should be rather simple, but I'd like to discuss how such an API should look in order for it to be as useful as possible.

I was thinking of the following form:

      regex ⎕Regex string

The way I envision this working is that the function returns ⍬ if there is no match, or a string containing the match if there is one:

      'f..' ⎕Regex 'xzooy'
┏⊖┓
┃0┃
┗━┛
      'f..' ⎕Regex 'xfooy'
'foo'

If the regex has subexpressions, those matches should be returned as individual strings:

      '([0-9]+)-([0-9]+)-([0-9]+)' ⎕Regex '2017-01-02'
┏→━━━━━━━━━━━━━━━┓
┃"2017" "01" "02"┃
┗∊━━━━━━━━━━━━━━━┛

This would be a very useful API, and reasonably easy to implement by simply calling into the standard regcomp() call: http://pubs.opengroup.org/onlinepubs/009695399/functions/regcomp.html
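
As a rough illustration of what the C side could look like, the sketch below wraps regcomp()/regexec() so that it behaves like the three examples above: no result when nothing matches, the whole match when the pattern has no subexpressions, and one string per subexpression otherwise. The function name regex_match, the MAX_GROUPS and GROUP_LEN limits, and the fixed-size output array are assumptions for this example, not an existing GNU APL interface.

```c
#include <regex.h>
#include <stdio.h>
#include <string.h>

#define MAX_GROUPS 10   /* assumed upper bound on reported groups */
#define GROUP_LEN  64   /* assumed maximum length of one result string */

/* Hypothetical helper: match `pattern` against `subject`.
 * Returns the number of strings written to `out`:
 *   0  if there is no match,
 *   1  (the whole match) if the pattern has no subexpressions,
 *   N  (one per subexpression) otherwise.
 * Returns -1 if the pattern does not compile. */
int regex_match(const char *pattern, const char *subject,
                char out[][GROUP_LEN], int max_out)
{
    regex_t re;
    regmatch_t m[MAX_GROUPS];

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;

    int count = 0;
    if (regexec(&re, subject, MAX_GROUPS, m, 0) == 0) {
        /* re_nsub is the number of parenthesised subexpressions.
         * Slot 0 is the whole match; slots 1..re_nsub are the groups. */
        size_t first = re.re_nsub > 0 ? 1 : 0;
        size_t last  = re.re_nsub;

        for (size_t i = first;
             i <= last && i < MAX_GROUPS && count < max_out; i++) {
            size_t len = 0;
            if (m[i].rm_so >= 0)          /* -1 means the group did not take part */
                len = (size_t)(m[i].rm_eo - m[i].rm_so);
            if (len >= GROUP_LEN)
                len = GROUP_LEN - 1;      /* truncate overly long matches */
            if (len > 0)
                memcpy(out[count], subject + m[i].rm_so, len);
            out[count][len] = '\0';
            count++;
        }
    }

    regfree(&re);
    return count;
}
```

A caller would declare something like char out[4][GROUP_LEN] and pass it with max_out set to 4; an APL binding would then turn the returned strings into ⍬, a simple character vector, or a nested vector as appropriate.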

What do you think? Is this a reasonable way to implement it? Any suggestions about alternative APIs?

Regards,
Elias

