Re: java.text.RuleBasedCollator && CollationElementIterator.

From: Stephen Crawley
Subject: Re: java.text.RuleBasedCollator && CollationElementIterator.
Date: Mon, 24 May 2004 13:56:11 +1000


Over the weekend I fixed up the Mauve CollationElementIterator testcase.
This now fails in one subtest because it appears that RuleBasedCollator
or CollationElementIterator 'eats' a leading space in the input string to be
collated.  (This behavior does not occur with the JDK 1.4.2 version.)
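
One way to observe the symptom is to dump the collation elements for a
string with and without a leading space and compare the two lists.  The
sketch below is only an assumption about how such a check might look.  It
is not the Mauve testcase, the class and method names are made up for
illustration, and it is written against a current JDK rather than 1.4.2.

```java
import java.text.CollationElementIterator;
import java.text.Collator;
import java.text.RuleBasedCollator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical probe class; not part of Mauve or Classpath.
class LeadingSpaceProbe {

    // Collect every collation element the iterator yields for the text.
    static List<Integer> elements(RuleBasedCollator collator, String text) {
        CollationElementIterator it = collator.getCollationElementIterator(text);
        List<Integer> out = new ArrayList<Integer>();
        for (int e = it.next(); e != CollationElementIterator.NULLORDER; e = it.next()) {
            out.add(e);
        }
        return out;
    }

    public static void main(String[] args) {
        // Assumes the en_US collator is rule based, as it is on Sun's JDK.
        RuleBasedCollator collator =
            (RuleBasedCollator) Collator.getInstance(Locale.US);
        System.out.println("\"abc\"  -> " + elements(collator, "abc"));
        System.out.println("\" abc\" -> " + elements(collator, " abc"));
    }
}
```

Running the same class on two implementations and comparing the second
line of output should show whether they disagree about the leading space.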

There is another problem with collating in Classpath at the moment: the
collation rules supplied for the various locales are thoroughly broken.  For
example, the collation rules for "en_us" only define digits and
unaccented (Latin) letters.

It would be a right pain if we had to create the collation rulesets from
scratch.  Fortunately, there is an easy answer.  The RuleBasedCollator
class has a public method called "getRules" that returns the ruleset
that was used to create the collator.  This can be used as follows on
a Sun JDK to extract the rulesets.

    1)  Select a locale by calling new Locale(String, String, String)
    2)  Get the locale's collator
    3)  Cast to a RuleBasedCollator.
    4)  Call getRules()
    5)  Convert the resulting String to a Java String literal and output it.
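
The five steps above can be sketched roughly as follows.  This is a
minimal illustration, not the app mentioned below.  The class name
DumpCollationRules and its helper methods are hypothetical, and the code
assumes a current JDK (StringBuilder and String.format are not available
on 1.4.2).

```java
import java.text.Collator;
import java.text.RuleBasedCollator;
import java.util.Locale;

// Hypothetical helper; not the app mentioned in this message.
class DumpCollationRules {

    // Steps 2 to 4: fetch the collator, cast it, and ask for its rules.
    static String rulesFor(Locale locale) {
        Collator collator = Collator.getInstance(locale);
        if (!(collator instanceof RuleBasedCollator)) {
            throw new IllegalStateException("not a RuleBasedCollator: " + locale);
        }
        return ((RuleBasedCollator) collator).getRules();
    }

    // Step 5: render a string as a Java string literal, escaping quotes,
    // backslashes, control characters, and anything outside printable ASCII.
    static String toJavaLiteral(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        // Octal escape: a Unicode escape for a line terminator
                        // would not compile inside a string literal.
                        sb.append(String.format("\\%03o", (int) c));
                    } else if (c > 0x7e) {
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.append('"').toString();
    }

    public static void main(String[] args) {
        // Step 1: language, country, variant; defaults to en_US.
        String language = args.length > 0 ? args[0] : "en";
        String country  = args.length > 1 ? args[1] : "US";
        String variant  = args.length > 2 ? args[2] : "";
        Locale locale = new Locale(language, country, variant);
        System.out.println(toJavaLiteral(rulesFor(locale)));
    }
}
```

Invoked as, say, "java DumpCollationRules fr FR", it prints one literal
that could be pasted into a locale resource file.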

If anyone wants to take on the task of correcting the Classpath locales
to use JDK compatible collation rules, I can provide a little Java app
that does the above.  Alternatively, I can run the app against a copy of
JDK 1.4.2 for any locale of interest and email you the results.

-- Steve
