bug-gnu-emacs

bug#24405: 24.5; Possibly ``forward-word`` doesn't respect ``word-combin


From: Eli Zaretskii
Subject: bug#24405: 24.5; Possibly ``forward-word`` doesn't respect ``word-combining-categories`` for word boundaries on changing between latin/phonetic scripts.
Date: Sat, 10 Sep 2016 13:05:09 +0300

tags 24405 + notabug
thanks

> From: Oleksandr Gavenko <gavenkoa@gmail.com>
> Date: Sat, 10 Sep 2016 11:33:45 +0300
> 
> Evaluate the following form with C-x C-e:
> 
>   (let ((word-combining-categories '((?l . ?y) (?y . ?l) (?l . ?l)))
>         (word-separating-categories nil))
>     (forward-word))
> 
>   HelloПривLLжɪəʊheləʊaiɪa
> 
> Point stopped between ʊ and h.
> 
> I have:
> 
>   (aref char-script-table ?ʊ) phonetic
>   (aref char-script-table ?h) latin
>   (aref char-script-table ?ж) cyrillic
> 
>   (category-set-mnemonics (char-category-set ?ʊ)) ".Ljl"
>   (category-set-mnemonics (char-category-set ?h)) ".Lalr"
> 
>   (category-docstring ?y) "Cyrillic"
>   (category-docstring ?l) "Latin"
> 
> I expected point to move to the last character before the newline.
> 
> Seems that:
> 
>   (?l . ?y) (?y . ?l)
> 
> has an effect, because point moved across the Cyrillic/Latin and
> Cyrillic/Phonetic boundaries but refused to move across the Latin/Phonetic one.
> 
> If this is the intended behavior, how can I make Emacs move across
> Latin/Phonetic script boundaries?

You can't do this for two characters that belong to different scripts
but share the same category in their category sets.  Both of these
characters have the 'l' (Latin) category in their sets, so you cannot
make Emacs stop treating the position between them as a word boundary.

For the same reason, including a cons cell whose members are
identical, such as (?l . ?l), has no effect.
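
As a rough illustration of the rule described above, here is a minimal
sketch in Emacs Lisp.  The helpers `my-pair-applies-p' and
`my-combines-p' are hypothetical names, not Emacs functions, and
category sets are modeled as the mnemonic strings that
`category-set-mnemonics' returns.  The matching rule (the first
category must be in the first character's set only, the second in the
second character's set only, and a nil member matches anything) is an
assumption based on the explanation above, not a copy of the actual
implementation.

```elisp
;; Hypothetical helpers for illustration; they are not part of Emacs.
;; Category sets are modeled as mnemonic strings, e.g. ".Ljl".

(defun my-pair-applies-p (pair cats1 cats2)
  "Return non-nil if PAIR (CAT1 . CAT2) applies to two adjacent characters.
CATS1 and CATS2 are the category mnemonic strings of the first and
second character.  Assumed rule: CAT1 must be in CATS1 but not in
CATS2, and CAT2 must be in CATS2 but not in CATS1.  A nil member of
PAIR is assumed to match anything."
  (let ((cat1 (car pair))
        (cat2 (cdr pair))
        (set1 (string-to-list cats1))
        (set2 (string-to-list cats2)))
    (and (or (null cat1)
             (and (memq cat1 set1) (not (memq cat1 set2))))
         (or (null cat2)
             (and (memq cat2 set2) (not (memq cat2 set1)))))))

(defun my-combines-p (pairs cats1 cats2)
  "Return non-nil if any element of PAIRS applies to CATS1 and CATS2."
  (let (found)
    (dolist (pair pairs found)
      (when (my-pair-applies-p pair cats1 cats2)
        (setq found t)))))

;; ʊ has ".Ljl" and h has ".Lalr" (values from the report), so 'l' is
;; in both sets and no pair in the reported list can apply.
(my-combines-p '((?l . ?y) (?y . ?l) (?l . ?l)) ".Ljl" ".Lalr") ; nil
```

Under this model, (?l . ?l) can never apply to any pair of characters,
because a category cannot be both present in and absent from the same
set.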

This is the intended behavior, yes.  The word-combining-categories
feature is designed to support specific, rare situations that involve
mixing Far Eastern scripts (e.g., use of Kanji characters in Japanese
text), not arbitrary games with Latin and European scripts.

May I ask why you need to treat the above as a single word?  In
what situation(s) does that make sense?

Thanks.




