bug-grep

Re: [PATCH 2/2] tests: add testcase for previous fix


From: Paolo Bonzini
Subject: Re: [PATCH 2/2] tests: add testcase for previous fix
Date: Thu, 23 Sep 2010 18:33:03 +0200

On 09/23/2010 06:19 PM, Paul Eggert wrote:
> On 09/23/10 04:52, Paolo Bonzini wrote:
>> Or better, we're at glibc's mercy:
>>
>> $ LC_ALL=cs_CZ.UTF-8 devel/grep/+build/src/grep -E '[A-Z]' in
>> 00a
>> 00g
>> 00A
>> 00G
>> 00Z
>>
>> Yay for yet another definition of range expressions.
>
> Can we fix things so that we're not at glibc's mercy, even there?
> We could preprocess the regular expression [A-Z], and turn it into
> [ABCDEFGHIJKLMNOPQRSTUVWXYZ], before we hand it off to glibc.
> POSIX would allow this behavior, and users would prefer it.
>
> This could be done in a gnulib module, so that other GNU programs
> could also use the fix.
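
For illustration only, the preprocessing step proposed here could look roughly like the sketch below. It is not grep or gnulib code; the function name expand_ranges is made up for this example, and it assumes that only ranges whose endpoints are both ASCII uppercase letters, both ASCII lowercase letters, or both ASCII digits get expanded. Character classes, equivalence classes and collating symbols such as [.A.] are passed through untouched.

```python
import string

# Hypothetical helper, not part of grep or gnulib.
_CLASSES = (string.ascii_uppercase, string.ascii_lowercase, string.digits)


def _expandable(lo: str, hi: str) -> bool:
    """True if lo-hi is a plain ASCII range within one of the classes above."""
    return any(lo in cls and hi in cls and lo <= hi for cls in _CLASSES)


def expand_ranges(pattern: str) -> str:
    """Expand simple x-y ranges inside bracket expressions by code point."""
    out = []
    i, n = 0, len(pattern)
    while i < n:
        c = pattern[i]
        if c == "\\" and i + 1 < n:
            # Escaped character outside a bracket expression: copy as-is.
            out.append(pattern[i:i + 2])
            i += 2
            continue
        if c != "[":
            out.append(c)
            i += 1
            continue
        # Start of a bracket expression.
        out.append("[")
        i += 1
        if i < n and pattern[i] == "^":
            out.append("^")
            i += 1
        first = True  # a leading "]" is a literal, not the terminator
        while i < n and (pattern[i] != "]" or first):
            first = False
            if pattern[i] == "[" and i + 1 < n and pattern[i + 1] in ":.=":
                # [:class:], [.symbol.] or [=equiv=]: copy unchanged.
                end = pattern.find(pattern[i + 1] + "]", i + 2)
                if end == -1:
                    raise ValueError("unterminated class in bracket expression")
                out.append(pattern[i:end + 2])
                i = end + 2
                continue
            lo = pattern[i]
            if (i + 2 < n and pattern[i + 1] == "-"
                    and _expandable(lo, pattern[i + 2])):
                hi = pattern[i + 2]
                out.extend(chr(k) for k in range(ord(lo), ord(hi) + 1))
                i += 3
            else:
                out.append(lo)
                i += 1
        if i >= n:
            raise ValueError("unterminated bracket expression")
        out.append("]")
        i += 1
    return "".join(out)


print(expand_ranges("[A-Z]"))
print(expand_ranges("[[.A.]-[.Z.]]"))
```

Restricting the expansion to same-class ASCII ranges keeps the sketch from ever emitting "]" or "^" into the rewritten bracket expression; anything else is handed to the regex library as written.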

That would invite huge confusion between tools. It is better sorted out in glibc.

I don't see a reason why glibc should refuse a proposal to differentiate [A-Z] (a code point range) from [[.A.]-[.Z.]] (a real strcoll comparison, not the current, absurd behavior that everybody hates). That is, if somebody writes the patch.
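
To make the distinction concrete, here is a small sketch of range membership under two different comparison functions. It is only an illustration: in_range, codepoint_cmp and toy_dictionary_cmp are names invented for this example, and the toy ordering (a < A < b < B < ... < z < Z) is an assumed stand-in for a locale's collation, not what glibc does and not a reproduction of the cs_CZ output quoted above. With a suitable locale installed, Python's locale.strcoll could be passed as cmp instead.

```python
def in_range(ch, lo, hi, cmp):
    """True if lo <= ch <= hi under the given three-way comparison."""
    return cmp(lo, ch) <= 0 and cmp(ch, hi) <= 0


def codepoint_cmp(a, b):
    """Compare by code point, as proposed for a plain [A-Z]."""
    return (a > b) - (a < b)


def toy_dictionary_cmp(a, b):
    """Assumed toy collation: a < A < b < B < ... < z < Z."""
    ka = (a.lower(), a.isupper())
    kb = (b.lower(), b.isupper())
    return (ka > kb) - (ka < kb)


sample = "agzAGZ"
by_codepoint = [c for c in sample if in_range(c, "A", "Z", codepoint_cmp)]
by_collation = [c for c in sample if in_range(c, "A", "Z", toy_dictionary_cmp)]
print(by_codepoint)
print(by_collation)
```

Under the code point comparison only the uppercase letters fall in the range, while the toy collation also admits most lowercase letters, which is the kind of surprise the two notations would keep apart.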

A small disadvantage is that collation order would no longer be available in fnmatch, since it is better to keep regex consistent with fnmatch.

While we are at it, Bruno, you worked a lot on the localization features of glibc. Can you shed light on what __collseq_table_lookup is supposed to mean?

Paolo
