Re: Composing Hebrew diacriticals

From: Eli Zaretskii
Subject: Re: Composing Hebrew diacriticals
Date: Fri, 14 May 2010 13:02:13 +0300

> From: Kenichi Handa <address@hidden>
> Cc: address@hidden, address@hidden, address@hidden
> Date: Fri, 14 May 2010 17:10:33 +0900
> I've just committed a fix.
> Eli, please check the comments of set_iterator_to_next, and
> verify that I'm doing the right thing.

It looks okay at first glance, thank you!

In the HELLO buffer, the RLM character is not composed with the
following parenthesis, though.  Is this a separate problem?

I will work on the issues you raised in the comments.  For now, I have
just one response: in this fragment from set_iterator_to_next:

                /* Update IT's char/byte positions to point the first
                   character of the next grapheme cluster, or to the
                   character visually after the current composition.  */
  #if 0
                /* Is it ok to do this directly? */
                IT_CHARPOS (*it) += it->cmp_it.nchars;
                IT_BYTEPOS (*it) += it->cmp_it.nbytes;
                /* Or do we have to call bidi_get_next_char_visually
                   repeatedly (perhaps not to confuse some internal
                   state of bidi_it)?  At least we must do this if we
                   have consumed all grapheme clusters in the current
                   composition because the next character will be in the
                   different bidi level.  */
                for (i = 0; i < it->cmp_it.nchars; i++)
                  bidi_get_next_char_visually (&it->bidi_it);

the "#else" part (the loop that calls bidi_get_next_char_visually once
per character) is doing TRT.  You cannot jump to a different place
in the buffer/string behind the back of bidi_get_next_char_visually,
because that would violate the integrity of its internal cache, which
must correspond to the buffer/string positions 1:1.
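
To illustrate the point, here is a toy model of the invariant.  It is
only a sketch and not the real bidi.c code: the struct and every
toy_* name are made up for illustration, and the real iterator moves
in visual order rather than simply adding 1.  The only thing it is
meant to show is that stepping through the iterator's own "next"
function keeps the cache in step with the position, while bumping the
position directly does not.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: NOT the real bidi.c data structures.  */
struct toy_bidi_it
{
  ptrdiff_t charpos;     /* position the iterator believes it is at */
  ptrdiff_t cache_last;  /* last position recorded in the toy cache */
};

/* Advance by one character and record the new position in the
   cache.  (Hypothetical stand-in for the real stepping function;
   the real one moves in visual order, this one just adds 1.)  */
static void
toy_next_char (struct toy_bidi_it *it)
{
  it->charpos++;
  it->cache_last = it->charpos;
}

/* The invariant: the cache corresponds to the position 1:1.  */
static int
toy_cache_consistent (const struct toy_bidi_it *it)
{
  return it->cache_last == it->charpos;
}

/* Skip NCHARS characters by stepping one at a time, so the cache
   sees every move.  */
static void
toy_skip_composition (struct toy_bidi_it *it, ptrdiff_t nchars)
{
  for (ptrdiff_t i = 0; i < nchars; i++)
    toy_next_char (it);
}

/* Skip NCHARS characters by changing the position directly,
   behind the back of the cache.  */
static void
toy_skip_composition_directly (struct toy_bidi_it *it, ptrdiff_t nchars)
{
  it->charpos += nchars;
}
```

Both functions leave charpos at the same value, but only
toy_skip_composition leaves toy_cache_consistent true afterwards.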

> I have not yet committed proper codes for Hebrew
> composition.  I'm now testing with this simple version.
> (let ((pattern "[\u05D0-\u05F2][\u0591-\u05BF\u05C1-\u05C5\u05C7]+"))
>   (set-char-table-range
>    composition-function-table '(#x591 . #x5C7)
>    (list (vector pattern 1 'font-shape-gstring)
>        ["[\u0591-\u05C7]" 0 font-shape-gstring]))
>   (set-char-table-range
>    composition-function-table #x5C0 nil)
>   (set-char-table-range
>    composition-function-table #x5C6 nil))

Could you please look at the message I posted in
I still see the infloop, with the current trunk, even when
bidi-display-reordering is set to nil, after I type BET and DAGESH, as
described in that message.  What kind of problems in the information
that Uniscribe returns to Emacs could cause such a loop?

If I type a different diacritical after BET, like PATAH, there's no
infloop, but the display is incorrect: I see both the isolated PATAH
and the composed BET+PATAH after it.

Jason, could you help me with this?  It looks like some
Uniscribe-specific issue.  TIA
