Re: More unicode blocks?
Wed, 28 Sep 2005 12:22:35 -0600
Mozilla Thunderbird 0.9 (X11/20041105)
Shaddy Baddah wrote:
> Today, I finally did what I had resolved to do some time ago. I delved
> into emacs's unicode support facilities.
> I am a little disappointed, because it has become apparent that the
> unicode character set support is limited to 3 specific blocks of the
> full unicode character set, those being the blocks that start and end at
> the indexes expressed in mule-unicode-0100-24ff, mule-unicode-2500-33ff
> and mule-unicode-e000-ffff.
> The blocks that I am interested in are the CJK Unified Ideographs blocks,
> which start at unicode index 0x4E00. Specifically, the characters that
> are shared by the character set encoded via the big5 encoding scheme.
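
To make the gap described above concrete, here is a minimal Python sketch (not Emacs code). It assumes the three charset names encode inclusive start and end code points, as the quoted message describes; the helper name `covering_charset` is hypothetical and used only for illustration. It checks which of the three ranges, if any, contains a given code point, and then confirms with Python's own `big5` codec that the character at 0x4E00 is representable in Big5.

```python
# Ranges taken from the charset names in the quoted message,
# assumed here to be inclusive start/end code points.
MULE_UNICODE_RANGES = {
    "mule-unicode-0100-24ff": (0x0100, 0x24FF),
    "mule-unicode-2500-33ff": (0x2500, 0x33FF),
    "mule-unicode-e000-ffff": (0xE000, 0xFFFF),
}

CJK_UNIFIED_START = 0x4E00  # start index given in the quoted message


def covering_charset(code_point):
    """Return the name of the range containing code_point, or None."""
    for name, (low, high) in MULE_UNICODE_RANGES.items():
        if low <= code_point <= high:
            return name
    return None


if __name__ == "__main__":
    # A Cyrillic letter falls inside the first range.
    print(covering_charset(0x0416))
    # The first CJK Unified Ideograph falls outside all three ranges.
    print(covering_charset(CJK_UNIFIED_START))
    # The same character round-trips through Python's big5 codec.
    big5_bytes = chr(CJK_UNIFIED_START).encode("big5")
    print(big5_bytes.decode("big5") == chr(CJK_UNIFIED_START))
```

The point of the sketch is only that 0x4E00 lies in the hole between 0x33FF and 0xE000, even though the character itself exists in Big5.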
Perhaps you should try Emacs 22 (aka CVS Emacs). Here are some items
from its etc/NEWS file:
*** The utf-8/16 coding systems have been enhanced.
By default, untranslatable utf-8 sequences are simply composed into
single quasi-characters. User option `utf-translate-cjk-mode' (it is
turned on by default) arranges to translate many utf-8 CJK character
sequences into real Emacs characters in a similar way to the Mule-UCS
system. As this loads a fairly big data on demand, people who are not
interested in CJK characters may want to customize it to nil.
You can augment/amend the CJK translation via hash tables
`ucs-mule-cjk-to-unicode' and `ucs-unicode-to-mule-cjk'. The utf-8
coding system now also encodes characters from most of Emacs's
one-dimensional internal charsets, specifically the ISO-8859 ones.
The utf-16 coding system is affected similarly.
*** A new coding system `euc-tw' has been added for traditional Chinese
in CNS encoding; it accepts both Big 5 and CNS as input; on saving,
Big 5 is then converted to CNS.
*** New variable `utf-translate-cjk-unicode-range' controls which
Unicode characters to translate in `utf-translate-cjk-mode'.
*** iso-10646-1 (`Unicode') fonts can be used to display any range of
characters encodable by the utf-8 coding system. Just specify the
> I have no problems displaying and editing these characters under the
> big5 coding scheme, so they are obviously well supported by emacs (and
> its internal coding scheme, right?).
> So, what is the impediment, or perhaps rationale, behind the lack of
> support for the additional unicode blocks at this stage of Emacs
> Is it simply to do with someone having to implement some type of
> character translation tables, or is there/how much more is there to it?
Sorry, I don't know the answers to those questions.