Re: idn.el and confusables.txt
Sun, 15 May 2011 20:34:55 +0300
> From: Kenichi Handa <address@hidden>
> Cc: address@hidden, address@hidden, address@hidden
> Date: Sun, 15 May 2011 22:06:23 +0900
> In article <address@hidden>, Eli Zaretskii <address@hidden> writes:
> > You see, the uni-*.el files we create out of the Unicode DB are not
> > used anywhere in application code, AFAIK. We use them to display
> > character properties in the likes of "C-u C-x =", and that's it.
> composite.el uses `general-category' and `canonical-combining-class'.
> ucs-normalize.el uses `decomposition' and `canonical-combining-class'.
> mule-cmds.el uses `name' and `old-name' for read-char-by-name.
Are functions defined by ucs-normalize.el used anywhere?
> Why did you have to create another table? Was it because
> get-char-code-property is defined in Lisp and not efficient
> to call from C?
Yes, calling a Lisp function (one that calls `load' at that!) in the
lowest level of the display engine was out of the question. But there
were several other reasons as well:
. get-char-code-property returns a property list in which bidi types
are recorded as symbols, while I needed them as small numeric
values of a C enumerated type (see bidi_type_t), to fit in a small
number of bits in `struct glyph'.
. The data structures manipulated by get-char-code-property include
complications (e.g., a function in the extra slot) for which I
could find no documentation, so I couldn't figure out whether it
would be possible to replace get-char-code-property by a simple
call to CHAR_TABLE_REF.
. Even if I could use CHAR_TABLE_REF, the additional call to
plist-get means more overhead. bidi_get_type, the function which
needs to look up the bidirectional type of an arbitrary character,
runs in the innermost loop of the display engine, and is called at
least once (sometimes more) for every character in the displayed
portion of the buffer, so it must be very efficient.
. For the bidi-mirrored property, the data in the `mirrored' property
recorded by uni-mirrored.el is simply inadequate: the value is a
boolean (albeit in the form of the symbols `Y' and `N'). What I needed
was for each character its mirrored character, if there is one;
this data was simply not available in uni-mirrored.el. The
corresponding function bidi_mirror_char is also called for a large
percentage of displayed characters, and must be efficient.
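
To make the first and last points concrete, here is a minimal sketch, not
the actual bidi.c code. All names (sk_bidi_type, sk_glyph, sk_mirror_char)
are hypothetical, the bit widths are illustrative assumptions, and the
mirror table holds only four ASCII bracket pairs. It shows a type stored
as a small integer in a bit field, and a lookup that returns the mirrored
character itself rather than a yes/no flag.

```c
#include <stddef.h>

/* Hypothetical stand-in for bidi_type_t: small integers, not Lisp symbols. */
enum sk_bidi_type { SK_NEUTRAL, SK_STRONG_L, SK_STRONG_R, SK_WEAK_EN };

/* Hypothetical glyph-like record; the widths are assumptions chosen only
   to show that the type needs just a few bits. */
struct sk_glyph
{
  unsigned int code : 22;
  unsigned int bidi_type : 3;
};

/* Each row maps a character to its mirror image.  A boolean "mirrored"
   property cannot supply the second column.  Rows are sorted by the
   first column so that binary search works. */
static const unsigned int sk_mirror_pairs[][2] = {
  { 0x0028, 0x0029 }, { 0x0029, 0x0028 },   /* ( ) */
  { 0x003C, 0x003E }, { 0x003E, 0x003C },   /* < > */
  { 0x005B, 0x005D }, { 0x005D, 0x005B },   /* [ ] */
  { 0x007B, 0x007D }, { 0x007D, 0x007B },   /* { } */
};

/* Return the mirror image of C, or C itself if it has none. */
static unsigned int
sk_mirror_char (unsigned int c)
{
  size_t lo = 0;
  size_t hi = sizeof sk_mirror_pairs / sizeof sk_mirror_pairs[0];

  while (lo < hi)
    {
      size_t mid = lo + (hi - lo) / 2;

      if (sk_mirror_pairs[mid][0] == c)
        return sk_mirror_pairs[mid][1];
      if (sk_mirror_pairs[mid][0] < c)
        lo = mid + 1;
      else
        hi = mid;
    }
  return c;
}
```

With such a table, sk_mirror_char ('(') yields ')' directly, with no
symbol comparison and no property-list traversal on the way.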
It was extremely frustrating to have all that data at my fingertips
and not be able to use it for the purposes of bidi.c, which at first
glance looks like a first-class client of the Unicode DB. What I wanted was
something similar to C ctype macros in simplicity and efficiency, but
nothing quite like that was available. A char-table comes close, but
it must be a simple table with numerical values -- and that is what
bidi.c currently uses, leaving uni-bidi.el unused.
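
As an illustration of what "similar to C ctype macros" could mean, here
is a small sketch under stated assumptions. The names (TB_BIDI_TYPE,
tb_bidi_table, tb_init), the table size, and the handful of ranges are
all hypothetical. It uses a plain C array in place of a char-table and is
not how bidi.c is actually written.

```c
#include <stddef.h>

/* Hypothetical small numeric codes for a few bidi types. */
enum { TB_NEUTRAL = 0, TB_STRONG_L = 1, TB_STRONG_R = 2, TB_WEAK_EN = 3 };

/* Flat table of numeric values, indexed directly by character code.
   The size is an assumption: just large enough to cover the Hebrew
   letters used in this sketch. */
#define TB_TABLE_SIZE 0x600u
static unsigned char tb_bidi_table[TB_TABLE_SIZE];

/* ctype-style lookup: one bounds check and one array access.
   Note that the argument is evaluated twice. */
#define TB_BIDI_TYPE(c) \
  ((unsigned int) (c) < TB_TABLE_SIZE \
   ? (int) tb_bidi_table[(unsigned int) (c)] : TB_NEUTRAL)

/* Store TYPE for every character in the inclusive range FROM..TO. */
static void
tb_fill (unsigned int from, unsigned int to, unsigned char type)
{
  unsigned int c;

  for (c = from; c <= to && c < TB_TABLE_SIZE; c++)
    tb_bidi_table[c] = type;
}

/* Populate a few ranges; everything else stays TB_NEUTRAL (zero). */
static void
tb_init (void)
{
  tb_fill (0x0030, 0x0039, TB_WEAK_EN);   /* ASCII digits */
  tb_fill (0x0041, 0x005A, TB_STRONG_L);  /* A-Z */
  tb_fill (0x0061, 0x007A, TB_STRONG_L);  /* a-z */
  tb_fill (0x05D0, 0x05EA, TB_STRONG_R);  /* Hebrew letters */
}
```

After tb_init (), a caller in an inner loop would write TB_BIDI_TYPE (ch)
the same way it would write isdigit (ch), and get back a number that can
be stored in a few bits without further conversion.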