Re: characters set's problem

From: Chris Gray
Subject: Re: characters set's problem
Date: Thu, 30 Oct 2003 12:09:13 +0100

On Thursday 30 October 2003 08:55, jsona laio wrote:
> hi mavens,

Hi jsona laio,

> i used to utilize sun java's vm to develop my project.
> however, lately i want to participate in a project
> which involves developing an encoding like CCCII (a CJK
> based character set for asian characters, which
> defines more characters than unicode supports in the
> CJK parts). however, as far as i know, the java vm is based
> upon unicode. so i hope to know whether it's
> possible to switch the encoding setting when using
> classpath (or how to avoid such a problem), for i'm
> afraid that code values may be lost when exchanging data
> 'twixt two character sets; e.g., between CCCII and
> unicode. or is there any better way to avoid such
> problems?
> i appreciate any suggestions, sincerely.

Standard Java (as of J2SE 1.4) uses Unicode, stores characters as 16-bit
quantities, and does not support the use of "surrogates". However, there is a
library developed by IBM, called ICU4J, which permits the use of surrogates:
you should then be able to use a "private" mapping to represent each 3-byte
CCCII character as a pair of UTF-16 chars.
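
As a rough sketch of what such a private mapping could look like (this is my
own assumption, not something ICU4J or CCCII defines): suppose each CCCII byte
lies in 0x21..0x7E, number the characters densely, and assign the first
131,068 of them to the supplementary private-use planes (U+F0000..U+FFFFD and
U+100000..U+10FFFD). The class and method names are hypothetical, and the
surrogate arithmetic is done by hand so that it does not depend on any
library.

```java
// Hypothetical sketch: maps a dense index derived from a 3-byte CCCII code
// onto the supplementary private-use planes and encodes it as a surrogate pair.
public final class CcciiPrivateMapping {
    private static final int FIRST_BYTE = 0x21;      // assumed lowest CCCII byte value
    private static final int LAST_BYTE = 0x7E;       // assumed highest CCCII byte value
    private static final int RANGE = 94;             // values per byte position
    private static final int PLANE15_START = 0xF0000;
    private static final int PLANE16_START = 0x100000;
    private static final int PER_PLANE = 0xFFFE;     // xFFFE and xFFFF are noncharacters

    private CcciiPrivateMapping() {}

    // Dense index of a CCCII code: 0 for 21 21 21, 1 for 21 21 22, and so on.
    public static int toIndex(int b1, int b2, int b3) {
        checkByte(b1);
        checkByte(b2);
        checkByte(b3);
        return ((b1 - FIRST_BYTE) * RANGE + (b2 - FIRST_BYTE)) * RANGE + (b3 - FIRST_BYTE);
    }

    // Private-use code point for an index; fails once both planes are used up.
    public static int toCodePoint(int index) {
        if (index < 0 || index >= 2 * PER_PLANE) {
            throw new IllegalArgumentException("no private-use slot for index " + index);
        }
        return index < PER_PLANE
                ? PLANE15_START + index
                : PLANE16_START + (index - PER_PLANE);
    }

    // Standard UTF-16 encoding of a supplementary code point.
    public static char[] toSurrogates(int codePoint) {
        int offset = codePoint - 0x10000;
        char high = (char) (0xD800 + (offset >> 10));
        char low = (char) (0xDC00 + (offset & 0x3FF));
        return new char[] { high, low };
    }

    // Inverse of toSurrogates.
    public static int fromSurrogates(char high, char low) {
        return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00);
    }

    // Convenience: one 3-byte CCCII code to a two-char Java String.
    public static String toUtf16(byte[] cccii) {
        if (cccii.length != 3) {
            throw new IllegalArgumentException("expected exactly 3 bytes");
        }
        int index = toIndex(cccii[0] & 0xFF, cccii[1] & 0xFF, cccii[2] & 0xFF);
        return new String(toSurrogates(toCodePoint(index)));
    }

    private static void checkByte(int b) {
        if (b < FIRST_BYTE || b > LAST_BYTE) {
            throw new IllegalArgumentException("byte out of assumed CCCII range: " + b);
        }
    }
}
```

Note that a dense index like this covers only part of the 94 x 94 x 94 code
space, so a real mapping would need a lookup table restricted to the
characters that are actually assigned.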

Of course you could also just pass your CCCII data around as arrays of bytes 
or longs. But then you'd need to define all your own libraries for 
manipulating the data, which probably is not what you want.
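
For comparison, the do-it-yourself route might start something like this: a
minimal sketch that packs each 3-byte CCCII code into one int (24 bits fit, so
a long is not strictly needed). The class and method names are hypothetical.

```java
// Hypothetical sketch: carries CCCII text as an int[] of packed 3-byte codes.
public final class CcciiPacked {
    private CcciiPacked() {}

    // Pack three bytes into the low 24 bits of an int, first byte highest.
    public static int pack(byte b1, byte b2, byte b3) {
        return ((b1 & 0xFF) << 16) | ((b2 & 0xFF) << 8) | (b3 & 0xFF);
    }

    // Recover the three bytes from a packed code.
    public static byte[] unpack(int code) {
        return new byte[] { (byte) (code >>> 16), (byte) (code >>> 8), (byte) code };
    }

    // Split a raw byte stream into packed codes, three bytes at a time.
    public static int[] decode(byte[] data) {
        if (data.length % 3 != 0) {
            throw new IllegalArgumentException("length is not a multiple of 3");
        }
        int[] codes = new int[data.length / 3];
        for (int i = 0; i < codes.length; i++) {
            codes[i] = pack(data[3 * i], data[3 * i + 1], data[3 * i + 2]);
        }
        return codes;
    }
}
```

Everything beyond this (comparison, searching, conversion on input and
output) would still have to be written by hand, since none of the String
machinery applies to an int[].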

ICU4J appears to be under the X licence, which is compatible with 
classpath's, so it should be possible to incorporate ICU4J into classpath. 
You could even lead the project. :)

Best wishes

Chris Gray                                /k/ Embedded Java Solutions
Embedded & Mobile Java, OSGi    
address@hidden                      +32 477 599 703
