Re: [emacs-bidi] bidi categories, derived from Unicode data
From: Alex Schroeder
Subject: Re: [emacs-bidi] bidi categories, derived from Unicode data
Date: Sat, 10 Nov 2001 01:35:27 +0100
User-agent: Gnus/5.090004 (Oort Gnus v0.04) Emacs/21.1 (i686-pc-linux-gnu)
"Eli Zaretskii" <address@hidden> writes:
> Here's how:
>
> (decode-char 'ucs uchar)
>
> where UCHAR is the Unicode codepoint. Easy, eh?
Hehe, just what I wanted.
>> -- or better yet, how do I get all characters from all the other
>> charsets matching it?
>
> For this, you will need tables, there's no single method.
I've used the tables Dave Love had in ucs-tables.el. This means that
my table now holds the UAX#9 categories for all UCS characters as well
as for all 8859 characters. All the other charsets remain untouched.
Unfortunately the Lisp files required to set this up are rather big,
so we'll have to find a way of dumping the info later. It gets even
worse if we want to add bidi categories to all the Asian charsets...
Attached you will find bidi.el which does the setup. Here's a short
textual description for those who will not read the code. It creates
a variable for every UAX#9 bidi type and gets the necessary number of
unused categories. Every category is identified by a character. This
character is stored in the respective variable. (This is a workaround
because I don't want to fix the categories yet. It will be removed
later.)
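For those who prefer code to prose, the setup described above could look
roughly like the sketch below. It is not the attached bidi.el: the
`my-bidi-' names and the optional TABLE argument are assumptions made
for illustration. It only relies on `get-unused-category' and
`define-category', which both accept an optional category table.

```elisp
;; Sketch only.  The `my-bidi-' names are assumptions made for this
;; example; they are not taken from the attached bidi.el.

(defvar my-bidi-types
  '(L LRE LRO R AL RLE RLO PDF EN ES ET AN CS NSM BN B S WS ON)
  "The 19 UAX#9 bidi types that predate the isolate types.")

(defun my-bidi-category-variable (type)
  "Return the symbol of the variable holding the category char for TYPE."
  (intern (format "my-bidi-category-%s" (downcase (symbol-name type)))))

(defun my-bidi-setup-categories (&optional table)
  "Give every bidi type an unused category in TABLE.
TABLE defaults to the current buffer's category table.  The category
character is stored in one variable per type."
  (dolist (type my-bidi-types)
    (let ((cat (get-unused-category table)))
      (unless cat
        (error "No unused category left for bidi type %s" type))
      (define-category cat (format "Bidi type %s" type) table)
      (set (my-bidi-category-variable type) cat))))
```

Calling (my-bidi-setup-categories (make-category-table)) tries this out
on a scratch table and leaves the standard category table alone.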
bidi-table.el contains several tables: one that maps UCS characters to
bidi types (actually, to the symbol which holds the "real" category
character), and several that map UCS to the ISO 8859 charsets,
provided by Dave Love (from his ucs-tables.el). Using this
information, the code in bidi-table.el assigns the bidi categories as
specified by the UnicodeData.txt file from unicode.org to all UCS and
to all 8859 characters.
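A minimal sketch of the assignment step, assuming a small hand-written
sample instead of the full table: the four entries, the `my-bidi-'
names and the shape of the alists are hypothetical and do not come from
bidi-table.el. The separate UCS-to-8859 mapping tables are left out
here. The category must already be defined in the table before
`modify-category-entry' will accept it.

```elisp
;; Sketch only.  Hypothetical sample data; the real table is generated
;; from UnicodeData.txt and is far larger.
(defvar my-bidi-sample-table
  '((#x0041 . L)    ; LATIN CAPITAL LETTER A
    (#x0030 . EN)   ; DIGIT ZERO
    (#x05D0 . R)    ; HEBREW LETTER ALEF
    (#x0627 . AL))  ; ARABIC LETTER ALEF
  "Alist of (UCS-CODEPOINT . BIDI-TYPE).")

(defun my-bidi-assign (alist type-to-category &optional table)
  "Add bidi categories to the characters listed in ALIST.
ALIST holds (UCS-CODEPOINT . BIDI-TYPE) entries.  TYPE-TO-CATEGORY is
an alist of (BIDI-TYPE . CATEGORY-CHAR).  TABLE defaults to the current
buffer's category table.  Return the number of characters changed."
  (let ((count 0))
    (dolist (entry alist count)
      (let ((char (decode-char 'ucs (car entry)))
            (cat (cdr (assq (cdr entry) type-to-category))))
        ;; Skip codepoints Emacs cannot decode and types without a category.
        (when (and char cat)
          (modify-category-entry char cat table)
          (setq count (1+ count)))))))
```

After the call, `char-category-set' together with
`category-set-mnemonics' shows which categories a character ended up
with in the current buffer's table.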
Some UCS characters seem not to exist in Emacs; this was surprising.
An example from the source: (decode-char 'ucs ?\x33FE) -- valid,
(decode-char 'ucs ?\x3400) -- invalid. I don't know what to make of
it. I currently just ignore the UCS characters where decode-char
returns nil.
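To see which codepoints are affected, something like the following
sketch can be used. The function name is hypothetical, and the result
depends on the Emacs version: the list may well be empty on an Emacs
that can represent all of these characters.

```elisp
;; Sketch only; `my-bidi-undecodable' is a name made up for this example.
(defun my-bidi-undecodable (from to)
  "Return the codepoints in FROM..TO for which `decode-char' returns nil.
Both bounds are inclusive.  The result is in ascending order."
  (let ((code from)
        missing)
    (while (<= code to)
      (unless (decode-char 'ucs code)
        (push code missing))
      (setq code (1+ code)))
    (nreverse missing)))
```

For the range mentioned above, try (my-bidi-undecodable #x33F0 #x3410).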
Alex.
--
http://www.emacswiki.org/
Attachment: bidi-table.el (application/emacs-lisp)
Attachment: bidi.el (application/emacs-lisp)