bug#25366: 26.0.50; [:blank:] character class should match all Unicode horizontal whitespace

From: Eli Zaretskii
Subject: bug#25366: 26.0.50; [:blank:] character class should match all Unicode horizontal whitespace
Date: Fri, 06 Jan 2017 17:11:48 +0200

> From: Philipp Stephani <address@hidden>
> Date: Fri, 06 Jan 2017 15:00:22 +0000
> Cc: address@hidden
>  http://www.unicode.org/reports/tr18/tr18-19.html#Compatibility_Properties
>  Patches to that effect are welcome.
> Here's a patch. 

Thanks.  A few minor comments below.

> +/* Return true if C is a horizontal whitespace character, as defined
> +   by http://www.unicode.org/reports/tr18/tr18-19.html#blank.  */
> +bool
> +blankp (int c)
> +{
> +  if (c == '\t')
> +    return true;

Why does this test explicitly only for a TAB?  What about SPC, for

> --- a/doc/lispref/searching.texi
> +++ b/doc/lispref/searching.texi
> @@ -553,7 +553,10 @@ Char Classes
>  (@pxref{Character Properties}) indicates they are alphabetic
>  characters.
>  @item [:blank:]
> -This matches space and tab only.
> +This matches horizontal whitespace, as defined by Unicode Technical
> +Standard #18.  In particular, it matches tabs and characters whose
> +Unicode @samp{general-category} property (@pxref{Character
> +Properties}) indicates they are spacing separators.

Similarly here: I find the lack of reference to a space potentially

> +** The regular expression character class [:blank:] now matches
> +Unicode horizontal whitespace as defined in
> +http://www.unicode.org/reports/tr18/tr18-19.html#blank.

The reference to a particular version of UTS#18 might become obsolete
when a new version is released.  So I suggest providing a general
reference to the report and its section, not an exact URL.
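
To illustrate the TAB-versus-SPC point about blankp, here is a minimal
standalone sketch, not the code from the patch.  It tests TAB and SPC
together up front, so a reader does not wonder whether SPC was
forgotten.  The name blankp_sketch and the hard-coded range table are
assumptions made for illustration; the table lists the code points in
the Unicode Space_Separator (Zs) general category, whereas the patch
presumably consults Emacs's own general-category data instead.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a general-category lookup: the code points
   whose Unicode general category is Space_Separator (Zs), hard-coded
   here as an assumption so the example is self-contained.  */
struct range { int first, last; };

static const struct range zs_ranges[] = {
  { 0x0020, 0x0020 },  /* SPACE */
  { 0x00A0, 0x00A0 },  /* NO-BREAK SPACE */
  { 0x1680, 0x1680 },  /* OGHAM SPACE MARK */
  { 0x2000, 0x200A },  /* EN QUAD .. HAIR SPACE */
  { 0x202F, 0x202F },  /* NARROW NO-BREAK SPACE */
  { 0x205F, 0x205F },  /* MEDIUM MATHEMATICAL SPACE */
  { 0x3000, 0x3000 },  /* IDEOGRAPHIC SPACE */
};

/* Return true if C is horizontal whitespace: a TAB, or a character in
   the Space_Separator category.  */
bool
blankp_sketch (int c)
{
  /* SPC is also covered by the table below; testing it here next to
     TAB is redundant but makes the intent obvious.  */
  if (c == '\t' || c == ' ')
    return true;

  for (size_t i = 0; i < sizeof zs_ranges / sizeof zs_ranges[0]; i++)
    if (zs_ranges[i].first <= c && c <= zs_ranges[i].last)
      return true;

  return false;
}
```

With this shape, line terminators such as LF and U+2028 are rejected,
while NO-BREAK SPACE and IDEOGRAPHIC SPACE are accepted.  The same
explicit mention of both space and tab would address the documentation
comment as well.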
