Re: [emacs-bidi] Re: Arabic support

From: Amit Aronovitch
Subject: Re: [emacs-bidi] Re: Arabic support
Date: Sat, 28 Aug 2010 13:15:44 +0300

On Fri, Aug 27, 2010 at 12:56 PM, Eli Zaretskii <address@hidden> wrote:
> > From: Kenichi Handa <address@hidden>
> > Date: Thu, 26 Aug 2010 10:10:05 +0900
> > I've just committed changes to trunk for Arabic shaping.  If
> > there're any Arabic users in this list, please check the
> > displaying of Arabic text.  On GNU/Linux system, you must
> > compile Emacs with libotf and m17n-lib (configure script
> > should detect them automatically).
>
> Thanks.  However, today's build behaves very strangely in a GUI
> session on MS-Windows.  For starters, cursor motion seems to jump
> across many characters in the "Arabic" line of etc/HELLO.  For
> example, typing C-f in that line, I first move one character at a time
> across "Arabic", as expected, then the cursor jumps to the right paren
> of the leftmost parenthesized part, again as expected, and then I see
> the following strange behavior:
>
>  . C-f moves one character to the left, to buffer position 758, as
>    expected.
>
>  . the next C-f jumps across many characters on the screen and lands
>    on position 764.
>
>  . another C-f jumps to what is reported as position 765, but on the
>    screen those are several characters, maybe 5 or 6.
>
>  . another C-f moves to the left paren at position 766, as expected.
>
>  . yet another C-f moves to position 767, but on the screen the
>    cursor jumps back into one of the characters it jumped across when
>    it landed on position 765 two C-f keypresses earlier.
>
>  . if I type C-b 4 times from this point, I enter a "trap", whereby
>    typing C-b jumps between two characters, whose buffer positions
>    are 764 and 765.  The only way to get out of the trap is with C-a
>    or C-e or C-f.
>
> I don't read Arabic, so I cannot really say whether any of this is
> expected behavior.  (The "trap" with C-b is certainly not the expected
> behavior.)  Do you see anything similar on X?

1) I confirm that Arabic shaping seems to work fine on my build (27/8/10 rev. 101200, on Linux+X (Debian unstable)).

2) Logical movement with C-f/C-b in etc/HELLO seems fine (I do not see the trap described above).

3) My Arabic is very basic, and I am not familiar with Arabic computing (keyboards etc.). I noticed the following points, but I am not sure what the expected behavior is (I can only compare with other programs, gedit in this case):

  a) Column numbers (column-number-mode) behave strangely (I suspect that m17n-lib's invisible markup consumes columns). For example, as you move with C-f through the word "هذا", the column numbers go "0,1,4,5" (i.e. the second character takes up 3 columns). If I change that to "بهذا", the column numbers are "0,1,4,6,7" (the second and third characters take up 3 and 2 columns respectively?).
  In gedit, each character takes exactly one column, regardless of shaping.

  b) The Arabic keyboard has the ligature "Lam-Alef" (U+FEFB) on the key marked "B" on QWERTY keyboards. When I type this in Emacs, I get Lam and Alef as two separate characters (which are auto-shaped correctly into the proper ligature). With the cursor on the ligature, C-d erases the Alef and another C-d erases the Lam. This seems like proper behavior to me. However, in gedit the "B" key produces a single U+FEFB character, which is always displayed as a ligature, is deleted by a single Del press, and never connects to the previous character. Cutting and pasting this character into Emacs, I get similar behavior there.
The question is: do Arabic users expect to be able to produce this "stiff" ligature? Is the behavior of gedit a bug? Should the Emacs "Lam-Alef" key behave as it does (i.e. produce two characters)?
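
To make points (a) and (b) concrete, here is a minimal Python sketch. It is not Emacs or gedit code; the helper names (per_char_widths, one_column_per_char) are hypothetical and only illustrate the arithmetic. The first part recovers the per-character widths implied by the column numbers I saw, and compares them with a one-column-per-character model like the one gedit appears to use. The second part uses the standard unicodedata module to show that U+FEFB is a single presentation-form code point whose compatibility decomposition is Lam followed by Alef, i.e. the two characters Emacs inserts.

```python
import unicodedata


def per_char_widths(columns):
    """Columns consumed by each C-f step, given the column numbers observed."""
    return [b - a for a, b in zip(columns, columns[1:])]


def one_column_per_char(text):
    """Assumed gedit-style model: the column is the count of preceding characters."""
    return list(range(len(text) + 1))


# The two test words from point (a), written as escapes to avoid bidi confusion.
HADHA = "\u0647\u0630\u0627"   # HEH, THAL, ALEF
BI_HADHA = "\u0628" + HADHA    # BEH prepended

# Point (b): the "stiff" ligature and its compatibility decomposition.
LAM_ALEF_ISOLATED = "\ufefb"
decomposed = unicodedata.normalize("NFKC", LAM_ALEF_ISOLATED)

if __name__ == "__main__":
    print(per_char_widths([0, 1, 4, 5]))     # columns observed in Emacs
    print(per_char_widths([0, 1, 4, 6, 7]))  # after prepending BEH
    print(one_column_per_char(HADHA))        # one column per character
    print(unicodedata.name(LAM_ALEF_ISOLATED))
    print([f"U+{ord(c):04X}" for c in decomposed])
```

The widths come out as 1,3,1 and 1,3,2,1, matching what I described in (a), whereas the one-column model gives 0,1,2,3 for the three-letter word. The decomposition yields U+0644 followed by U+0627, so a Lam-Alef key that inserts the decomposed pair lets the shaping engine choose the right joining form, while inserting U+FEFB directly fixes the isolated form.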

      Amit Aronovitch
