Re: Better emoji support

From: Eli Zaretskii
Subject: Re: Better emoji support
Date: Mon, 20 Sep 2021 22:42:29 +0300

> From: Robert Pluim <rpluim@gmail.com>
> Cc: kevin.legouguec@gmail.com,  emacs-devel@gnu.org
> Date: Mon, 20 Sep 2021 21:30:13 +0200
> >>>>> On Mon, 20 Sep 2021 21:54:57 +0300, Eli Zaretskii <eliz@gnu.org> said:
>     Eli> for Emoji sequences in composition-function-table should be anchored
>     Eli> on the VS-n codepoints (which I think is a good idea regardless).
>     >> 
>     >> Weʼd have to raise the lookback limit for composition-function-table
>     >> rules higher than 3 (maybe only to 4).
>     Eli> Examples?  Not that it's a catastrophe.
> From emoji-zwj-sequences.txt:
> 1F468 1F3FB 200D 2764 FE0F 200D 1F468 1F3FB ; RGI_Emoji_ZWJ_Sequence
> ; couple with heart: man, man, light skin tone                   #
> E13.1  [1] (👨🏻‍❤️‍👨🏻)
> With the current limit you'd get no further than the 1F3FB if you
> anchored at FE0F, and miss the 1F468.
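
To make the lookback arithmetic in the quoted example concrete, here is a small Python sketch. It is only an illustration: the helper name `reachable_start` is hypothetical, and the model (a rule anchored on one codepoint may extend backwards by at most N characters) is a simplification of how composition-function-table rules are matched, not Emacs code.

```python
# Codepoints of the quoted emoji-zwj-sequences.txt entry
# (couple with heart: man, man, light skin tone).
SEQUENCE = [0x1F468, 0x1F3FB, 0x200D, 0x2764, 0xFE0F, 0x200D, 0x1F468, 0x1F3FB]
VS16 = 0xFE0F


def reachable_start(seq, anchor, lookback):
    """Return the index of the earliest codepoint a rule anchored at the
    first occurrence of `anchor` can reach with `lookback` preceding chars.

    Hypothetical helper: it models only the counting, not the matching.
    """
    anchor_index = seq.index(anchor)
    return max(0, anchor_index - lookback)


for limit in (3, 4):
    start = reachable_start(SEQUENCE, VS16, limit)
    print(f"lookback {limit}: match can start at "
          f"U+{SEQUENCE[start]:04X} (index {start})")
```

Under this simplified model, a limit of 3 reaches back only to 1F3FB, while a limit of 4 reaches the leading 1F468.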

Ah, that's a misunderstanding.  I meant what I said only for sequences
that start with a non-emoji character.  When the first character is
from the emoji script, we don't need anything special to have the
right font used.

>     >> I guess it reduces the number of entries in
>     >> composition-function-table, but then you end up with a lot of rules
>     >> for eg VS-16.
>     Eli> Why do you think we need to have a lot of such rules?  What kind of
>     Eli> rules did you think about?
> For whatever reason, a lot of the sequences in emoji-zwj-sequences.txt
> contain codepoints with Emoji_Presentation = No, hence theyʼre
> followed by VS-16. As a result, anchoring to VS-16 would produce a
> lot of rules for VS-16.

We don't need a separate rule for every sequence; we can use a regular
expression with character sets.  We can even have regexps that match
more than emoji-zwj-sequences.txt specifies, since the font and the
shaping engine will sort that out and return a failure indication for
sequences that the font doesn't support.
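
As a rough sketch of the idea, here is one deliberately over-broad pattern built from character sets, written with Python's `re` module rather than Emacs regexp syntax. The character ranges are assumptions chosen for illustration, not the sets an actual composition-function-table entry would use, and nothing here calls the shaping engine.

```python
import re

# Assumed, intentionally generous character sets (not taken from Emacs).
EMOJI = "[\U0001F300-\U0001FAFF\u2600-\u27BF]"
MODIFIER = "[\U0001F3FB-\U0001F3FF]"  # skin-tone modifiers
VS16 = "\uFE0F"
ZWJ = "\u200D"

# One element: an emoji, optionally followed by a modifier or VS-16.
ELEMENT = f"{EMOJI}(?:{MODIFIER}|{VS16})?"
# A ZWJ sequence: two or more elements joined by ZWJ.
ZWJ_SEQUENCE = re.compile(f"{ELEMENT}(?:{ZWJ}{ELEMENT})+")

# The sequence quoted from emoji-zwj-sequences.txt.
couple = "\U0001F468\U0001F3FB\u200D\u2764\uFE0F\u200D\U0001F468\U0001F3FB"
# Two grinning faces joined by ZWJ; as far as I know this is not listed.
unlisted = "\U0001F600\u200D\U0001F600"

print(bool(ZWJ_SEQUENCE.fullmatch(couple)))
print(bool(ZWJ_SEQUENCE.fullmatch(unlisted)))
```

A single pattern accepts both the listed sequence and the unlisted one; in the scheme described above, rejecting the unlisted one would be left to the font and the shaping engine.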

> Anyway, we can measure the difference, if any, once we have the base
> implementation and Someone™ implements the VS-16 anchored version (it
> would only be a dozen lines of awk, I think).

Let's cross that bridge when we get to it.
