emacs-devel

Re: emacs-27 60c84ad: ; * etc/TODO: Fix last change.


From: Eli Zaretskii
Subject: Re: emacs-27 60c84ad: ; * etc/TODO: Fix last change.
Date: Mon, 02 Mar 2020 17:25:26 +0200

> From: Robert Pluim <address@hidden>
> Cc: address@hidden
> Date: Mon, 02 Mar 2020 16:06:17 +0100
> 
>     Eli> On second thought: why do you need regexp-opt in this case?  None of
>     Eli> the other composition rules we have (search lisp/language/*.el for
>     Eli> composition-function-table) use that, so why is Emoji different?
> 
> The ones in lisp/language/*.el were presumably written by hand, unlike
> the Emoji ones.
> 
> Here's an example of the patterns we want to match for U+1F3C3 (there
> are longer ones):
> 
>  "\N{U+1F3C3}\N{U+200D}\N{U+2640}\N{U+FE0F}"
>  "\N{U+1F3C3}\N{U+200D}\N{U+2642}\N{U+FE0F}"
>  "\N{U+1F3C3}\N{U+1F3FB}\N{U+200D}\N{U+2640}\N{U+FE0F}"
>  "\N{U+1F3C3}\N{U+1F3FB}\N{U+200D}\N{U+2642}\N{U+FE0F}"
>  "\N{U+1F3C3}\N{U+1F3FC}\N{U+200D}\N{U+2640}\N{U+FE0F}"
>  "\N{U+1F3C3}\N{U+1F3FC}\N{U+200D}\N{U+2642}\N{U+FE0F}"
>  "\N{U+1F3C3}\N{U+1F3FD}\N{U+200D}\N{U+2640}\N{U+FE0F}"
>  "\N{U+1F3C3}\N{U+1F3FD}\N{U+200D}\N{U+2642}\N{U+FE0F}"
>  "\N{U+1F3C3}\N{U+1F3FE}\N{U+200D}\N{U+2640}\N{U+FE0F}"
>  "\N{U+1F3C3}\N{U+1F3FE}\N{U+200D}\N{U+2642}\N{U+FE0F}"
>  "\N{U+1F3C3}\N{U+1F3FF}\N{U+200D}\N{U+2640}\N{U+FE0F}"
>  "\N{U+1F3C3}\N{U+1F3FF}\N{U+200D}\N{U+2642}\N{U+FE0F}"
> 
> Now we could add 12 rules here, one for each pattern, or 1 rule with
> all the patterns as alternatives, or we could run regexp-opt and add
> one optimized pattern.

If this is easier done by hand, maybe we should just do that.  We
could instead have an automated way of _checking_ the patterns against
emoji-*.txt files and flagging the new ones to add.  After all,
Unicode files don't change too frequently, and we already use similar
practices with other Unicode data files we import, see
admin/notes/unicode.
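
To make the checking idea concrete, here is a rough sketch of what such
a script might look like. It is only an illustration, not an existing
tool in admin/: the function names (parse_sequences, as_code_points,
flag_new), the KNOWN set and the abridged SAMPLE text are all
hypothetical, and it assumes the layout used by emoji-zwj-sequences.txt
(space-separated hex code points, then ';'-separated fields, with '#'
starting a comment).

```python
def parse_sequences(text):
    """Return the set of emoji sequences (as strings) listed in TEXT.

    Assumed layout: hex code points separated by spaces, followed by
    ';'-separated fields; '#' starts a comment.
    """
    sequences = set()
    for line in text.splitlines():
        data = line.split("#", 1)[0].strip()   # drop comments
        if not data:
            continue
        field = data.split(";", 1)[0].strip()  # first field: code points
        if not field or ".." in field:
            continue                           # skip code point ranges
        sequences.add("".join(chr(int(cp, 16)) for cp in field.split()))
    return sequences


def as_code_points(sequence):
    """Format SEQUENCE as 'U+XXXX U+XXXX ...' for a human-readable report."""
    return " ".join("U+%04X" % ord(ch) for ch in sequence)


def flag_new(text, known):
    """Return, sorted, the sequences in TEXT that are missing from KNOWN."""
    return sorted(parse_sequences(text) - set(known))


# Abridged, illustrative input; the real data files carry more fields.
SAMPLE = """\
# sample lines in the assumed emoji-zwj-sequences.txt layout
1F3C3 200D 2640 FE0F       ; RGI_Emoji_ZWJ_Sequence ; woman running # ...
1F3C3 1F3FB 200D 2640 FE0F ; RGI_Emoji_ZWJ_Sequence ; woman running: light skin tone # ...
"""

# Hypothetical stand-in for the sequences our hand-written rules cover.
KNOWN = {"\U0001F3C3\u200D\u2640\uFE0F"}

for seq in flag_new(SAMPLE, KNOWN):
    print("new sequence:", as_code_points(seq))
```

How the set of already-covered sequences would be extracted from the
Lisp side is left open here; the sketch only shows the comparison step
against the Unicode data.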


