Re: locale specific ordering in EN_US -- why is a<A<b<B<y<Y<z<Z?

From: Linda Walsh
Subject: Re: locale specific ordering in EN_US -- why is a<A<b<B<y<Y<z<Z?
Date: Mon, 21 May 2012 20:19:31 -0700

Eric Blake wrote:

They still don't make any sense in any locale except C, because POSIX no
longer requires collating order.

The regex(7) man page says that [xx-xx] uses **collating order**:

The regex(7) man page _of which system_?  Just because _some_ systems
(like glibc, picking the POSIX 1992 semantics) have well-defined
semantics, doesn't mean that all systems have those same semantics.
According to POSIX, you cannot portably assume ANY semantics for ranges
except in the C locale.  And if RRI gains traction, that means that you
can assume ASCII collation, across ALL locales, but this is a different
order than collation of a specific locale, and it is also a GNU
extension not guaranteed by POSIX.

        Well, that would be nice, but if Unicode takes off, *cough*,
and anyone claims Unicode compliance (isn't UTF-8 the standard for HTML5
and XML?), they are also guaranteed ordering -- full ordering for the full
Unicode character set.

        It would be VERY GOOD if RRI didn't come up with an order that
was DIFFERENT from that prescribed by Unicode -- otherwise that could open
another can of worms.
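
To make the ordering question in this thread concrete, here is a minimal Python sketch of how the same bracket range [a-z] can match different characters depending on which ordering is used. It does not call any real locale or regex engine. The names codepoint_key, interleaved_key, in_range and range_members are made up for illustration, and interleaved_key is only an assumed stand-in for the a<A<b<B<...<z<Z order from the subject line, not the actual en_US collation tables.

```python
import string

LETTERS = string.ascii_letters  # the 52 ASCII letters

def codepoint_key(ch):
    # C-locale / ASCII style ordering: compare by numeric code point.
    return ord(ch)

def interleaved_key(ch):
    # Assumed stand-in for a collation where a < A < b < B < ... < z < Z.
    # Lowercase sorts before uppercase of the same letter (False < True).
    return (ch.lower(), ch.isupper())

def in_range(ch, lo, hi, key):
    # Membership test for a range [lo-hi] under the given ordering.
    return key(lo) <= key(ch) <= key(hi)

def range_members(lo, hi, key):
    # All ASCII letters that fall inside [lo-hi], listed in that ordering.
    members = [c for c in LETTERS if in_range(c, lo, hi, key)]
    return "".join(sorted(members, key=key))

codepoint_az = range_members("a", "z", codepoint_key)
interleaved_az = range_members("a", "z", interleaved_key)

print(codepoint_az)    # lowercase letters only
print(interleaved_az)  # also picks up A through Y, but not Z
```

Under the code point ordering, [a-z] covers only the lowercase letters. Under the interleaved ordering it also takes in A through Y and leaves out Z, because Z sorts after z. That mismatch is why a range can only be relied on when the ordering behind it is known.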
