Re: Scan of regexps in Emacs (March 17)

From: Paul Eggert
Subject: Re: Scan of regexps in Emacs (March 17)
Date: Tue, 2 Apr 2019 15:08:43 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.6.1

On 4/2/19 7:15 AM, Mattias Engdegård wrote:
> where does a user go to understand extant regexps?

A user that *really* wants to know can go read the source code and get
confused, just like I did. :-)

But I think it's better if the documentation doesn't say what happens.
If you prefer that the documentation explicitly say that it doesn't say
what happens, I guess that would be OK too (what sort of wording would
you like, though?).

> (Do we have any latitude at all for changing even obscure corners of
> regexp syntax and semantics today?)

I would say so, certainly for the raw 8-bit-bytes in ranges stuff (where
nobody knows what they mean or even should mean), and possibly even for
some of the other rarely-used and questionable uses.

> I've attached the ones found by a modified relint/xr, in case you are 
> interested.

Sure! Fixed in the attached patch.

> > +A character alternative can include duplicates.  For example,
> > address@hidden is less clear than @samp{[XYa-z]}.
> Certainly, but does this need to be mentioned? Overlapping ranges are rarely
> written on purpose. Besides, duplication isn't confined to ranges.

That example does contain non-range duplicates. I think duplicates are
worth mentioning (if only so that your trawler can point to the style
advice if people complain about the trawler being too picky :-).

> More useful, I think, would be to recommend ranges to stay within natural 
> sequences (letters, digits, etc) so that a reader needn't consult a table to 
> see what is included. Thus [0-9.:/] good, [.-:] bad, even though they denote 
> the same set.

Good idea. I did that in the attached patch, which I just installed into
master; I hope it addresses the points you raised. I hope that the Thai
example doesn't mess things up (I considered doing Arabic, which would
have been more fun :-).
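
To illustrate the point about ranges quoted above, here is a minimal sketch
that checks that [0-9.:/] and [.-:] denote the same set over ASCII. It uses
Python's re module rather than Emacs regexps, on the assumption that the two
agree for plain ASCII ranges like these; the helper name char_set is
hypothetical and is not part of the attached patch.

```python
import re

def char_set(char_class):
    """Return the set of ASCII characters matched by a bracket expression."""
    pattern = re.compile(char_class)
    return {chr(c) for c in range(128) if pattern.fullmatch(chr(c))}

# The "natural" spelling lists the digits as a range and the punctuation
# characters individually; the "opaque" spelling relies on '.', '/',
# '0'..'9' and ':' being adjacent in ASCII (code points 46 through 58).
natural = char_set(r"[0-9.:/]")
opaque = char_set(r"[.-:]")

print(natural == opaque)
print("".join(sorted(natural)))
```

Both spellings match the same thirteen characters, but only the first lets a
reader see that without consulting a character table.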

Attachment: 0001-Improve-regexp-advice-again-and-unchain-ranges.txt
Description: Text document
