Re: Better emoji support

From: Robert Pluim
Subject: Re: Better emoji support
Date: Mon, 20 Sep 2021 22:05:10 +0200

>>>>> On Mon, 20 Sep 2021 22:42:29 +0300, Eli Zaretskii <eliz@gnu.org> said:

    >> From: Robert Pluim <rpluim@gmail.com>
    >> Cc: kevin.legouguec@gmail.com,  emacs-devel@gnu.org
    >> Date: Mon, 20 Sep 2021 21:30:13 +0200
    >> >>>>> On Mon, 20 Sep 2021 21:54:57 +0300, Eli Zaretskii <eliz@gnu.org> 
    Eli> for Emoji sequences in composition-function-table should be anchored
    Eli> on the VS-n codepoints (which I think is a good idea regardless).
    >> >> 
    >> >> Weʼd have to raise the lookback limit for composition-function-table
    >> >> rules higher than 3 (maybe only to 4).
    Eli> Examples?  Not that it's a catastrophe.
    >> From emoji-zwj-sequences.txt:
    >> 1F468 1F3FB 200D 2764 FE0F 200D 1F468 1F3FB ; RGI_Emoji_ZWJ_Sequence
    >> ; couple with heart: man, man, light skin tone                   #
    >> E13.1  [1] (👨🏻‍❤️‍👨🏻)
    >> With the current limit you'd get no further than the 1F3FB if you
    >> anchored at FE0F, and miss the 1F468.
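
To make the arithmetic in that example concrete, here is a minimal sketch in Python. It assumes a deliberately simplified model in which a rule anchored on a codepoint can look back at most `lookback` characters; the helper names are hypothetical and this is not how Emacs implements composition-function-table.

```python
# The sequence quoted above: couple with heart: man, man, light skin tone.
SEQ = [0x1F468, 0x1F3FB, 0x200D, 0x2764, 0xFE0F, 0x200D, 0x1F468, 0x1F3FB]
VS16 = 0xFE0F


def reachable_start(seq, anchor, lookback):
    """Index of the earliest codepoint reachable when looking back
    `lookback` characters from the first occurrence of `anchor`.
    Simplified model, hypothetical helper."""
    pos = seq.index(anchor)
    return max(0, pos - lookback)


def covers_whole_sequence(seq, anchor, lookback):
    """True if the lookback window reaches the first codepoint."""
    return reachable_start(seq, anchor, lookback) == 0


for limit in (3, 4):
    start = reachable_start(SEQ, VS16, limit)
    print(f"lookback {limit}: reaches U+{SEQ[start]:04X}, "
          f"whole sequence covered: {covers_whole_sequence(SEQ, VS16, limit)}")
```

Under this model a lookback of 3 stops at the 1F3FB in second position, while a lookback of 4 reaches the leading 1F468.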

    Eli> Ah, that's a misunderstanding.  I meant what I said only for sequences
    Eli> that start with a non-emoji character.  When the first character is
    Eli> from the emoji script, we don't need anything special to have the
    Eli> right font used.

Phew. Let's talk about en/de-coding next, fun for all the family :-)

    >> >> I guess it reduces the number of entries in
    >> >> composition-function-table, but then you end up with a lot of rules
    >> >> for eg VS-16.
    Eli> Why do you think we need to have a lot of such rules?  What kind of
    Eli> rules did you think about?
    >> For whatever reason, a lot of the sequences in emoji-zwj-sequences.txt
    >> contain codepoints with Emoji_Presentation = No, hence theyʼre
    >> followed by VS-16. As a result, anchoring to VS-16 would produce a
    >> lot of rules for VS-16.

    Eli> We don't need a separate rule for every sequence, we can use a regular
    Eli> expression with character sets.  We can even have regexps that match
    Eli> more than emoji-zwj-sequences.txt specifies, since the font and the
    Eli> shaping engine will sort that out and return a failure indication for
    Eli> sequences that the font doesn't support.
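
As an illustration of the "regular expression with character sets" idea, here is a rough Python sketch. The character ranges are an assumption chosen for brevity, not the sets Emacs uses, and the pattern intentionally matches more than emoji-zwj-sequences.txt lists, leaving it to the font and shaping engine to reject what they cannot render.

```python
import re

ZWJ = "\u200d"
VS16 = "\ufe0f"

# Deliberately loose "emoji-ish" ranges (an assumption for illustration).
ELEMENT = "[\u2600-\u27bf\U0001F000-\U0001FAFF]"
# Skin tone modifiers U+1F3FB..U+1F3FF.
MODIFIER = "[\U0001F3FB-\U0001F3FF]"

# One element, optionally followed by a modifier and/or VS-16.
UNIT = f"{ELEMENT}{MODIFIER}?{VS16}?"

# Two or more units joined by ZWJ.
ZWJ_SEQUENCE = re.compile(f"{UNIT}(?:{ZWJ}{UNIT})+")

# The sequence quoted earlier in the thread.
couple = "".join(map(chr, [0x1F468, 0x1F3FB, 0x200D, 0x2764, 0xFE0F,
                           0x200D, 0x1F468, 0x1F3FB]))
# Not an RGI sequence; the regexp accepts it anyway.
rocket_rocket = "\U0001F680\u200d\U0001F680"

print(bool(ZWJ_SEQUENCE.fullmatch(couple)))
print(bool(ZWJ_SEQUENCE.fullmatch(rocket_rocket)))
```

One pattern of this shape stands in for many per-sequence rules, at the cost of handing some unsupported sequences to the shaper.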


    >> Anyway, we can measure the difference, if any, once we have the base
    >> implementation and Someone™ implements the VS-16 anchored version (it
    >> would only be a dozen lines of awk, I think).

    Eli> Let's cross that bridge when we get to it.

Right. For now we key off the first character in the sequence
specified in emoji-zwj-sequences.txt.
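
Keying off the first character could be sketched like this, in Python rather than awk. It assumes data lines of the form "codepoints ; type ; name # comment" as in the excerpt quoted above; the function names are hypothetical and this is not the actual generation script.

```python
from collections import defaultdict


def parse_line(line):
    """Return the codepoints on a data line, or None for comment/blank lines."""
    data = line.split("#", 1)[0].strip()
    if not data:
        return None
    first_field = data.split(";")[0]
    return [int(cp, 16) for cp in first_field.split()]


def group_by_first(lines):
    """Map each leading codepoint to the sequences that start with it."""
    groups = defaultdict(list)
    for line in lines:
        cps = parse_line(line)
        if cps:
            groups[cps[0]].append("".join(map(chr, cps)))
    return dict(groups)


# Sample input in the file's format; the first data line is the one quoted above.
SAMPLE = [
    "# comment line",
    "1F468 1F3FB 200D 2764 FE0F 200D 1F468 1F3FB ; RGI_Emoji_ZWJ_Sequence ; "
    "couple with heart: man, man, light skin tone #",
    "1F468 200D 1F680 ; RGI_Emoji_ZWJ_Sequence ; man astronaut #",
    "1F469 200D 1F680 ; RGI_Emoji_ZWJ_Sequence ; woman astronaut #",
]

for first, seqs in sorted(group_by_first(SAMPLE).items()):
    print(f"U+{first:04X}: {len(seqs)} sequence(s)")
```

Each key would then get one composition-function-table entry covering all sequences that begin with that character.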

