Re: [Grep-devel] [bug-gawk] GNU grep, awk, sed: support \u and \U for un

sed-devel

From:	Paul Eggert
Subject:	Re: [Grep-devel] [bug-gawk] GNU grep, awk, sed: support \u and \U for unicode
Date:	Thu, 19 Jan 2017 18:48:59 -0800
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.5.1

Assaf Gordon wrote:

> Currently, escape sequences are parsed and converted before
> being sent to re/dfa.
> Thus, '[\u0041]' is equivalent to '[A]'

POSIX requires [\u0041] to be equivalent to [u0041\], that is, it matches any of the characters '\', 'u', '0', '4', and '1'. This is true for grep, sed, and most other utilities that use regular expressions. (awk is an exception.) So except for awk, we can't simply translate \u escapes everywhere. At best we could translate them only if not POSIXLY_CORRECT.
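
To make the bracket-expression point concrete, here is a minimal sketch that calls the POSIX regcomp/regexec interface directly. The helper name posix_matches is hypothetical and is not taken from grep, sed, or gawk; it only reports whether a basic regular expression matches somewhere in a string. Note that the C string literal "[\\u0041]" is the regex [\u0041].

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper for illustration only: returns 1 if the POSIX
   basic regular expression PATTERN matches somewhere in TEXT, 0 if it
   does not, and -1 if PATTERN fails to compile. */
int posix_matches(const char *pattern, const char *text)
{
    regex_t re;

    /* cflags without REG_EXTENDED selects basic regular expressions;
       REG_NOSUB because only a yes/no answer is needed. */
    if (regcomp(&re, pattern, REG_NOSUB) != 0)
        return -1;

    int rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}
```

With a POSIX-conforming regcomp, posix_matches("[\\u0041]", "u") and posix_matches("[\\u0041]", "\\") should both return 1, while posix_matches("[\\u0041]", "A") should return 0, because a backslash inside a bracket expression is an ordinary character rather than the start of an escape.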

On another topic, if we can't implement \N escapes in general then I wouldn't bother with implementing only \N{U+nnnn}.
