Re: idn.el and confusables.txt

From: Eli Zaretskii
Subject: Re: idn.el and confusables.txt
Date: Sun, 15 May 2011 20:34:55 +0300

> From: Kenichi Handa <address@hidden>
> Cc: address@hidden, address@hidden, address@hidden
> Date: Sun, 15 May 2011 22:06:23 +0900
> In article <address@hidden>, Eli Zaretskii <address@hidden> writes:
> > You see, the uni-*.el files we create out of the Unicode DB are not
> > used anywhere in application code, AFAIK.  We use them to display
> > character properties in the likes of "C-u C-x =", and that's it.
> composite.el uses `general-category' and `canonical-combining-class'.
> ucs-normalize.el uses `decomposition' and `canonical-combining-class'.
> mule-cmds.el uses `name' and `old-name' for read-char-by-name.

Are functions defined by ucs-normalize.el used anywhere?

> Why did you have to create another table?  Was it because
> get-char-code-property is defined in Lisp and not efficient
> to call from C?

Yes, calling a Lisp function (one that calls `load' at that!) in the
lowest level of the display engine was out of the question.  But there
were several other reasons as well:

  . get-char-code-property returns a property list in which bidi types
    are recorded as symbols, while I needed them as small numeric
    values of a C enumerated type (see bidi_type_t), to fit in a small
    number of bits in `struct glyph'.

  . The data structures manipulated by get-char-code-property include
    complications (e.g., a function in the extra slot) for which I
    could find no documentation, so I couldn't figure out whether it
    would be possible to replace get-char-code-property by a simple
    call to CHAR_TABLE_REF.

  . Even if I could use CHAR_TABLE_REF, the additional call to
    plist-get means more overhead.  bidi_get_type, the function which
    needs to look up the bidirectional type of an arbitrary character,
    runs in the innermost loop of the display engine, and is called at
    least once (sometimes more) for every character in the displayed
    portion of the buffer, so it must be very efficient.

  . For the bidi-mirrored property, the data in the `mirrored' property
    recorded by uni-mirrored.el is simply inadequate: the value is a
    boolean (albeit in the form of the symbols `Y' and `N').  What I
    needed was, for each character, its mirrored character, if there
    is one; this data was simply not available in uni-mirrored.el.
    The corresponding function bidi_mirror_char is also called for a
    large percentage of displayed characters, and must be efficient.
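To make the last point concrete, here is a minimal sketch of the kind
of lookup a boolean `Y'/`N' property cannot support: mapping a
character to its mirror image.  It is not the code in bidi.c; the
names mirror_pair, mirror_table and sketch_mirror_char are
hypothetical, the table is a tiny excerpt of the Unicode mirroring
pairs, and the binary search is just one plausible way to keep the
lookup cheap.

```c
#include <stddef.h>

/* Hypothetical entry type: a character and its mirror image.  */
struct mirror_pair { unsigned int ch; unsigned int mirror; };

/* Sorted by `ch'.  A tiny excerpt for illustration only, not the
   full Unicode mirroring data.  */
static const struct mirror_pair mirror_table[] = {
  { 0x0028, 0x0029 },		/* ( -> ) */
  { 0x0029, 0x0028 },		/* ) -> ( */
  { 0x003C, 0x003E },		/* < -> > */
  { 0x003E, 0x003C },		/* > -> < */
  { 0x005B, 0x005D },		/* [ -> ] */
  { 0x005D, 0x005B },		/* ] -> [ */
  { 0x007B, 0x007D },		/* { -> } */
  { 0x007D, 0x007B },		/* } -> { */
  { 0x00AB, 0x00BB },		/* left guillemet -> right */
  { 0x00BB, 0x00AB },		/* right guillemet -> left */
};

/* Return the mirror image of C, or C itself if the table has no
   entry for it.  Binary search over the sorted table.  */
unsigned int
sketch_mirror_char (unsigned int c)
{
  size_t lo = 0;
  size_t hi = sizeof mirror_table / sizeof mirror_table[0];

  while (lo < hi)
    {
      size_t mid = lo + (hi - lo) / 2;

      if (mirror_table[mid].ch == c)
	return mirror_table[mid].mirror;
      if (mirror_table[mid].ch < c)
	lo = mid + 1;
      else
	hi = mid;
    }
  return c;
}
```

With this table, sketch_mirror_char ('(') yields ')', while a
character with no entry, such as 'a', comes back unchanged.  A boolean
property could only say that '(' is mirrored, not what it mirrors to.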

It was extremely frustrating to have all that data at my fingertips
and not be able to use it for the purposes of bidi.c, which at first
glance seems like a first-class client of the Unicode DB.  What I
wanted was something similar to the C ctype macros in simplicity and
efficiency, but nothing quite like that was available.  A char-table
comes close, but it must be a simple table with numerical values --
and that is what bidi.c currently uses, leaving uni-bidi.el unused.
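
As a rough illustration of what "similar to the C ctype macros" could
mean, the sketch below stores each character's bidi type as a small
integer in a flat table and reads it back with a macro.  Everything
here is an assumption made for the example: the enum is a hypothetical,
abbreviated stand-in (not Emacs's bidi_type_t), the table covers only
code points below 0x600 rather than using a char-table, and
sketch_glyph merely shows that such a value fits in a 3-bit field.

```c
/* Hypothetical, abbreviated set of bidi classes for illustration.  */
enum sketch_bidi_type
{
  SK_UNKNOWN = 0,
  SK_STRONG_L,			/* left-to-right letters */
  SK_STRONG_R,			/* right-to-left letters */
  SK_WEAK_EN,			/* European digits */
  SK_NEUTRAL_WS			/* whitespace */
};

/* Hypothetical glyph record: the type fits in 3 bits.  */
struct sketch_glyph
{
  unsigned int bidi_type : 3;
  unsigned int level : 5;
};

#define SK_TABLE_SIZE 0x600

/* One byte per code point; zero-initialized, i.e. SK_UNKNOWN.  */
static unsigned char sk_type_table[SK_TABLE_SIZE];

/* Fill in a few well-known ranges; a real table would be generated
   from the Unicode data files.  */
void
sketch_init_table (void)
{
  int c;

  for (c = 'A'; c <= 'Z'; c++)
    sk_type_table[c] = SK_STRONG_L;
  for (c = 'a'; c <= 'z'; c++)
    sk_type_table[c] = SK_STRONG_L;
  for (c = '0'; c <= '9'; c++)
    sk_type_table[c] = SK_WEAK_EN;
  sk_type_table[' '] = SK_NEUTRAL_WS;
  for (c = 0x05D0; c <= 0x05EA; c++)	/* Hebrew letters */
    sk_type_table[c] = SK_STRONG_R;
}

/* ctype-style lookup: one bounds check and one array reference.
   Note that the argument is evaluated twice.  */
#define SKETCH_BIDI_TYPE(c)					\
  ((unsigned int) (c) < SK_TABLE_SIZE				\
   ? (enum sketch_bidi_type) sk_type_table[(c)]			\
   : SK_UNKNOWN)
```

After calling sketch_init_table, SKETCH_BIDI_TYPE (0x05D0) gives
SK_STRONG_R and the result can be stored directly in the bidi_type
field of a sketch_glyph, with no symbol-to-number conversion and no
plist-get on the way.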
