From: Kenichi Handa
Subject: Re: UCS-2BE
Date: Fri, 01 Sep 2006 21:26:59 +0900
User-agent: SEMI/1.14.3 (Ushinoya) FLIM/1.14.2 (Yagi-Nishiguchi) APEL/10.2 Emacs/22.0.50 (i686-pc-linux-gnu) MULE/5.0 (SAKAKI)

Thank you for the info! 

In article <address@hidden>, YAMAMOTO Mitsuharu <address@hidden> writes:

> "Unicode Technical Report #17, Character Encoding Model"
> (http://www.unicode.org/reports/tr17/index.html) says:
>   Examples of Unicode Character Encoding Schemes:
>     Unicode 1.1 had three character encoding schemes: UTF-8, UCS-2BE,
>     and UCS-2LE, although the latter two were not named that way at
>     the time.

Ah!  So here we can see the term "UCS-2BE" used as a CES.  But
how was it defined? (I don't have a copy of Unicode 1.1.)

> I suspect "UCS-2BE" is just a customary name and not explicitly
> defined even in ISO/IEC 10646.

> "UTF-8 and Unicode FAQ" (http://www.cl.cam.ac.uk/~mgk25/unicode.html)
> says:

>   No endianess is implied by the encoding names UCS-2, UCS-4, UTF-16,
>   and UTF-32, though ISO 10646-1 says that Bigendian should be
>   preferred unless otherwise agreed.  It has become customary to
>   append the letters "BE" (Bigendian, high-byte first) and "LE"
>   (Littleendian, low-byte first) to the encoding names in order to
>   explicitly specify a byte order.
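
To make the "BE"/"LE" convention quoted above concrete, here is a small Python sketch.  It uses Python's built-in "utf-16-be" and "utf-16-le" codecs as a stand-in, on the assumption that UCS-2 and UTF-16 produce the same bytes for a BMP character such as U+3042; the variable names are only for illustration.

```python
# U+3042 (HIRAGANA LETTER A) is a BMP character, so its UTF-16 form is a
# single 16-bit code unit, 0x3042, with no surrogate pair involved.
ch = "\u3042"

be = ch.encode("utf-16-be")  # "BE": high byte first
le = ch.encode("utf-16-le")  # "LE": low byte first

assert be == b"\x30\x42"
assert le == b"\x42\x30"

# The two encodings differ only in the order of the two bytes.
assert be == le[::-1]
```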

I don't know how authoritative this page is, but it also says:

    A full featured character encoding converter will have
    to provide the following 13 encoding variants of Unicode
    and UCS:

        UCS-2, UCS-2BE, UCS-2LE, UCS-4, UCS-4LE, UCS-4BE,
        UTF-8, UTF-16, UTF-16BE, UTF-16LE, UTF-32, UTF-32BE,
        UTF-32LE

So it seems that UCS-2BE is not a mislabel of UTF-16BE, and
that treating it as a subset of UTF-16BE (one that does not use
surrogate pairs), as iconv does, is the right thing.
I'll try to implement it (and the others) in emacs-unicode-2.
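
As a rough illustration of the "subset of UTF-16BE without surrogate pairs" reading, here is a minimal Python sketch.  The function names encode_ucs2be and decode_ucs2be are hypothetical, and signalling an error for anything outside the subset is only an assumption made for this sketch; it is not meant to describe what iconv or Emacs actually does in those cases.

```python
import struct


def encode_ucs2be(text: str) -> bytes:
    """Encode text as UTF-16BE, refusing anything that needs surrogates."""
    for ch in text:
        cp = ord(ch)
        # Characters above U+FFFF would need a surrogate pair in UTF-16.
        if cp > 0xFFFF:
            raise ValueError("U+%04X is outside the BMP" % cp)
        # Surrogate code points are not characters; reject them as well.
        if 0xD800 <= cp <= 0xDFFF:
            raise ValueError("U+%04X is a surrogate code point" % cp)
    return text.encode("utf-16-be")


def decode_ucs2be(data: bytes) -> str:
    """Decode big-endian 16-bit units, refusing surrogate code units."""
    if len(data) % 2 != 0:
        raise ValueError("UCS-2BE data must have an even number of bytes")
    # ">" selects big-endian; "H" is an unsigned 16-bit integer.
    units = struct.unpack(">%dH" % (len(data) // 2), data)
    for unit in units:
        if 0xD800 <= unit <= 0xDFFF:
            raise ValueError("surrogate code unit 0x%04X" % unit)
    return "".join(chr(unit) for unit in units)
```

With these definitions, encode_ucs2be("A\u3042") gives the four bytes 00 41 30 42, and decode_ucs2be turns them back into the same string, while a character such as U+1F600 is rejected instead of being written as a surrogate pair.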

By the way, why do people want so many variants... sigh...

Kenichi Handa
