[emacs-bidi] Arabic in Unicode
From: TAKAHASHI Naoto
Subject: [emacs-bidi] Arabic in Unicode
Date: Thu, 8 Nov 2001 17:06:47 +0900 (JST)
User-agent: SEMI/1.14.3 (Ushinoya) FLIM/1.14.2 (Yagi-Nishiguchi) APEL/10.2 Emacs/21.1 (sparc-sun-solaris2.8) MULE/5.0 (SAKAKI)
Alex Schroeder writes:
> I don't know how Arab is represented in Unicode. Picking up my book,
> however, I see that there are often three ways of writing a letter --
> one for the beginning of words, one for within words, one for the end
> of words.
Not three but four, actually; the fourth is called "isolated form".
> Are these represented as different characters in Unicode
> (or Latin 6)?
In ISO-8859-6 (Latin/Arabic, which I assume is what you mean by
Latin 6), no.
In Unicode, yes and no.
ISO-8859-6 defines only one codepoint for each Arabic letter; glyph
variations (word-initial, medial, final or isolated) are not
represented.
Unicode defines one basic codepoint plus up to four presentation-form
codepoints for each Arabic letter. The presentation forms are
provided for compatibility purposes and their usage is discouraged.
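
To make the "one basic codepoint plus presentation forms" point concrete,
here is a small sketch in Python (not Emacs code) that asks the Unicode
character database about the letter BEH. It uses only the standard
unicodedata module; the helper name to_basic is made up for this
illustration.

```python
import unicodedata

BEH = "\u0628"  # basic codepoint: ARABIC LETTER BEH

# U+FE8F..U+FE92 are the four presentation forms of BEH
# (isolated, final, initial, medial) in Arabic Presentation Forms-B.
for cp in range(0xFE8F, 0xFE93):
    ch = chr(cp)
    # decomposition() returns e.g. "<initial> 0628": the form tag
    # followed by the basic codepoint this form is a variant of.
    print(f"U+{cp:04X}", unicodedata.name(ch), unicodedata.decomposition(ch))


def to_basic(text):
    """Fold presentation forms back to basic codepoints (hypothetical helper).

    NFKC applies the compatibility decompositions, so every presentation
    form is replaced by the basic letter it stands for.
    """
    return unicodedata.normalize("NFKC", text)
```

Text that arrives with presentation forms baked in can be folded back to
basic letters this way before it is stored.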
Alex Schroeder writes:
> When Emacs uses Unicode internally, then we can move the table to
> Unicode. At the moment, however, I think we should use Mule because
> of the following reasons:
> 1. Nobody knows when Unicode will happen.
Just for your info: here, Handa and I have already succeeded in
displaying Arabic letters using Emacs-21.0.104. They are represented
in the mule-unicode-0100-24ff charset. The buffer contains only basic
forms; appropriate glyph variations are selected on the fly, using a
mechanism similar to font-lock, and displayed on the screen. Bi-di
support is minimal; it uses the same algorithm as in Mule-2.3.
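
The idea of keeping only basic forms in the buffer and choosing the glyph
variation at display time can be sketched roughly as follows. This is
Python, not our actual Emacs code, and it is a deliberate simplification:
the function names are invented, the joining behaviour is inferred from
which presentation forms exist in U+FE70..U+FEFF, and it ignores
diacritics, ligatures such as LAM-ALEF, and bi-di reordering.

```python
import unicodedata

FORMS = ("isolated", "final", "initial", "medial")


def build_form_table():
    """Map each basic letter to its presentation forms, read from the UCD."""
    table = {}
    for cp in range(0xFE70, 0xFF00):  # Arabic Presentation Forms-B
        parts = unicodedata.decomposition(chr(cp)).split()
        if len(parts) != 2:
            continue  # unassigned, or decomposes to several codepoints
        form = parts[0].strip("<>")
        if form in FORMS:
            base = chr(int(parts[1], 16))
            table.setdefault(base, {})[form] = chr(cp)
    return table


TABLE = build_form_table()


def shape(text):
    """Pick a display form for each basic letter, in logical order.

    Assumption: a letter joins to the following one if it has an initial
    form, and to the preceding one if it has a final form.
    """
    out = []
    for i, ch in enumerate(text):
        forms = TABLE.get(ch, {})
        prev_forms = TABLE.get(text[i - 1], {}) if i > 0 else {}
        next_forms = TABLE.get(text[i + 1], {}) if i + 1 < len(text) else {}
        joins_prev = "initial" in prev_forms and "final" in forms
        joins_next = "initial" in forms and "final" in next_forms
        if joins_prev and joins_next:
            form = "medial"
        elif joins_prev:
            form = "final"
        elif joins_next:
            form = "initial"
        else:
            form = "isolated"
        out.append(forms.get(form, ch))  # leave non-Arabic text untouched
    return "".join(out)
```

For example, shape("\u0643\u062A\u0627\u0628") (KAF TEH ALEF BEH) selects
the initial, medial, final and isolated forms respectively, because ALEF
does not join to the letter that follows it. The stored text itself is
never modified; only the string handed to the display changes.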
To release the modification, we need to get official permission
from the responsible section of our institute. The necessary procedure
has just begun, but nobody knows how long it will take.
--
TAKAHASHI Naoto
address@hidden
http://www.m17n.org/ntakahas/