Re: characters set's problem
From: Chris Gray
Subject: Re: characters set's problem
Date: Thu, 30 Oct 2003 12:09:13 +0100
On Thursday 30 October 2003 08:55, jsona laio wrote:
> hi mavens,
Hi jsona laio,
> i used to use sun java's vm to develop my project.
> however, lately i want to participate in a project
> which involves developing an encoding like CCCII (a
> CJK character set for asian characters, which
> defines more characters than unicode supports in the
> CJK ranges). however, as far as i know, the java vm is
> based upon unicode. so i would like to know whether it
> is possible to switch the encoding setting when using
> classpath (or how to avoid such a problem), for i'm
> afraid that code values may be lost when exchanging
> data between two character sets, e.g., between CCCII
> and unicode. or is there any better way to avoid such
> problems?
> i appreciate any suggestions, sincerely.
Standard Java uses Unicode, stores characters as 16-bit quantities, and does
not support the use of "surrogates". However, there is a library developed by
IBM, called ICU4J, which permits the use of surrogates: you should then be
able to use a "private" mapping to represent each 3-byte CCCII character as a
series of UTF-16 chars.
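
To make the "private" mapping idea concrete, here is a rough sketch in plain
Java. It is only an illustration under stated assumptions: the class name
CcciiPrivateMapping and the scheme of handing out code points one by one from
Supplementary Private Use Area-A (U+F0000 to U+FFFFD) are hypothetical, not a
published CCCII table, and ICU4J itself is not used here. The surrogate
arithmetic is the standard UTF-16 calculation.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: private mapping from 3-byte CCCII codes to
// private-use Unicode code points, stored as UTF-16 surrogate pairs.
final class CcciiPrivateMapping {
    // Supplementary Private Use Area-A (plane 15).
    static final int PUA_A_START = 0xF0000;
    static final int PUA_A_END = 0xFFFFD;

    private final Map<Integer, Integer> ccciiToCodePoint = new HashMap<>();
    private final Map<Integer, Integer> codePointToCccii = new HashMap<>();
    private int next = PUA_A_START;

    // Return the private-use code point for a CCCII code, assigning the
    // next free one on first use (assumed scheme, for illustration only).
    int assign(int cccii) {
        Integer existing = ccciiToCodePoint.get(cccii);
        if (existing != null) {
            return existing;
        }
        if (next > PUA_A_END) {
            throw new IllegalStateException("private-use plane exhausted");
        }
        int codePoint = next++;
        ccciiToCodePoint.put(cccii, codePoint);
        codePointToCccii.put(codePoint, cccii);
        return codePoint;
    }

    // Reverse lookup: private-use code point back to the CCCII code.
    int lookupCccii(int codePoint) {
        Integer cccii = codePointToCccii.get(codePoint);
        if (cccii == null) {
            throw new IllegalArgumentException("unmapped code point");
        }
        return cccii;
    }

    // Split a supplementary code point (>= 0x10000) into two 16-bit chars.
    static char[] toSurrogates(int codePoint) {
        int v = codePoint - 0x10000;
        char high = (char) (0xD800 + (v >> 10));
        char low = (char) (0xDC00 + (v & 0x3FF));
        return new char[] { high, low };
    }

    // Recombine a high/low surrogate pair into the code point.
    static int fromSurrogates(char high, char low) {
        return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00);
    }

    public static void main(String[] args) {
        CcciiPrivateMapping mapping = new CcciiPrivateMapping();
        int codePoint = mapping.assign(0x214421); // example 3-byte code
        char[] pair = toSurrogates(codePoint);
        System.out.println(Integer.toHexString(pair[0]) + " "
                + Integer.toHexString(pair[1]));
    }
}
```

Note that one private-use plane holds only 65,534 code points, so a scheme
like this can only cover the CCCII characters that actually occur in your
data, and both sides of any exchange need the same table.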
Of course you could also just pass your CCCII data around as arrays of bytes
or longs. But then you'd need to define all your own libraries for
manipulating the data, which probably is not what you want.
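
For comparison, a minimal sketch of the byte-array route might look like the
following. The class CcciiText and its methods are made up for illustration;
the point is that even basic operations such as length and search have to be
written by hand once you step outside String.

```java
// Hypothetical sketch: CCCII text kept as raw bytes, three bytes per character.
final class CcciiText {
    private final byte[] data;

    CcciiText(byte[] data) {
        if (data.length % 3 != 0) {
            throw new IllegalArgumentException("length must be a multiple of 3");
        }
        this.data = data.clone(); // defensive copy
    }

    // Number of CCCII characters, not bytes.
    int length() {
        return data.length / 3;
    }

    // The 3-byte code of the character at the given index, packed into an int.
    int codeAt(int index) {
        int i = index * 3;
        return ((data[i] & 0xFF) << 16)
                | ((data[i + 1] & 0xFF) << 8)
                | (data[i + 2] & 0xFF);
    }

    // Index of the first occurrence of a code, or -1 if absent.
    int indexOf(int code) {
        for (int i = 0; i < length(); i++) {
            if (codeAt(i) == code) {
                return i;
            }
        }
        return -1;
    }
}
```

Everything else you get for free with String (comparison, concatenation,
case handling, I/O) would need the same treatment.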
ICU4J appears to be under the X licence, which is compatible with
classpath's, so it should be possible to incorporate ICU4J into classpath.
You could even lead the project. :)
Best wishes
--
Chris Gray /k/ Embedded Java Solutions
Embedded & Mobile Java, OSGi http://www.kiffer.be/k/
address@hidden +32 477 599 703