bug-gnu-emacs

bug#37659: rx additions: anychar, unmatchable, unordered-or


From: Robert Pluim
Subject: bug#37659: rx additions: anychar, unmatchable, unordered-or
Date: Tue, 22 Oct 2019 17:27:48 +0200

>>>>> On Tue, 22 Oct 2019 17:14:08 +0200, Mattias Engdegård <mattiase@acm.org> said:

    Mattias> 'regexp-opt' always generates a regexp preferring long
    Mattias> matches. This is undocumented, but useful enough that I
    Mattias> would be surprised if this property wasn't exploited
    Mattias> (perhaps unknowingly) by callers. It's quite natural: given
    Mattias> a set of strings, surely the caller wants them all to be
    Mattias> candidates for a match, even if there is no following
    Mattias> anchoring pattern.

    Mattias> Thus, instead of 'unordered-or', define the operator in
    Mattias> terms of long matches: 'or-max' (working name) would work
    Mattias> like 'or' but guarantee a longest match, and only permit
    Mattias> strings and 'or-max' forms as arguments. Thus, the rx user
    Mattias> gets all the benefits from 'regexp-opt' in a composable
    Mattias> way, without a need to sort the strings or otherwise
    Mattias> prepare them.

    Mattias> (The old 'or' behaviour always used 'regexp-opt' when
    Mattias> possible, which was very fragile: (or "a" "ab") would match
    Mattias> "ab", but (or "a" "ab" digit) would just match "a".
    Mattias> 'or-max' is robust, without surprises.)

    Mattias> Of course, we should also guarantee the maximum-matching
    Mattias> property of regexp-opt. This is just a matter of
    Mattias> documentation (and test); it does not restrict
    Mattias> optimisations as far as I can tell.

    Mattias> Again, I'm open to suggestions about a better name than 'or-max'.

or-greedy?
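
For reference, a small sketch of the difference being discussed, assuming
the usual left-to-right handling of "\\|" in Emacs regexps and the
longest-match property of 'regexp-opt' described above. The helper
'demo-match-at-start' is a made-up name for illustration only; it is not
part of rx or regexp-opt.

```elisp
;; Hypothetical helper, for illustration only.
(defun demo-match-at-start (regexp string)
  "Return the text that REGEXP matches at the start of STRING, or nil."
  ;; Wrap REGEXP in a shy group so the \\` anchor applies to every branch.
  (when (string-match (concat "\\`\\(?:" regexp "\\)") string)
    (match-string 0 string)))

;; Hand-written alternation: branches are tried in order, so the
;; shorter "a" is expected to win against the input "ab".
(demo-match-at-start "a\\|ab" "ab")

;; Same strings passed through regexp-opt: per the discussion above,
;; the generated regexp should prefer the longer "ab".
(demo-match-at-start (regexp-opt '("a" "ab")) "ab")
```

An operator like the proposed 'or-max' would presumably give the second
behaviour inside rx forms without the caller having to call 'regexp-opt'
or sort the strings by hand.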




