From: Eli Zaretskii
Subject: Re: Can watermarking Unicode text using invisible differences sneak through Emacs, or can Emacs detect it?
Date: Tue, 08 Feb 2022 14:20:53 +0200

> From: Richard Stallman <rms@gnu.org>
> Cc: psainty@orcon.net.nz, luangruo@yahoo.com,
>       kevin.legouguec@gmail.com, emacs-devel@gnu.org
> Date: Mon, 07 Feb 2022 22:55:54 -0500
>   > > I think there are only around 20 diacritics.
>   > You are thinking of some subset, I think.  The real number is more
>   > like 80,
> I am amazed.  Where can I see a list that shows more of them?

Type "C-x 8 RET COMBINING", press TAB, then filter out of the
candidates those which pertain to Cyrillic, Greek, and other specific
scripts, leaving just Latin and those which don't belong to specific
scripts.
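
For readers without Emacs at hand, roughly the same list can be
approximated outside of Emacs.  The following is only a sketch, in
Python rather than Emacs Lisp: the function name combining_names and
the default exclusion list are my own assumptions, and the exclusion
list is deliberately incomplete, so extend it with whatever other
script names you want to drop.  The result depends on the Unicode
version your Python was built with.

```python
import sys
import unicodedata


def combining_names(exclude=("CYRILLIC", "GREEK")):
    """Return (code point, name) pairs for characters named COMBINING ...

    `exclude` is a hypothetical, incomplete list of script words; any
    character whose name contains one of them is skipped.
    """
    found = []
    for cp in range(sys.maxunicode + 1):
        # Unnamed and unassigned code points yield the empty string.
        name = unicodedata.name(chr(cp), "")
        if name.startswith("COMBINING ") and not any(w in name for w in exclude):
            found.append((cp, name))
    return found


if __name__ == "__main__":
    for cp, name in combining_names():
        print(f"U+{cp:04X}  {name}")
```

The names come from the same Unicode character database that the
completion in "C-x 8 RET" draws on, so the candidates should be
broadly similar, though not necessarily identical.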

>   >   That's a great simplification from a table
>   > > of hundreds of elements, set up by hand.
>   > Setting by hand was already done, and we have it in latin1-disp.el so
> Do you mean, the table that presents a-with-breve-and-tilde as `a)?'?
> I don't think that works well.

I think it works as well as it could, but in any case, seeing all the
combinations explicitly is needed to provide reasonable results.

>   > > I don't follow you here.  In particular, what does "complete
>   > > equivalent" mean?
>   > For example, "o?'" instead of "o" + "?" + "'" (to emulate ?\ṍ).
> I don't understand the difference between "o?'" and "o" + "?" + "'".

Your proposal is to have separate rules to produce the equivalent of
each diacritic, so you will never see "o?'", only its components
separately; I denoted the former by "o?'" and the latter by
"o" + "?" + "'".

>   >   What would you do with the likes of ?\ǿ (which we currently
>   > represent as "o/'")?  Its base character, ø, doesn't have a
>   > decomposition in Unicode.
> For my terminal, I'd like it to send ø literally since my terminal
> can display that.  `ø'' would be a good way to display it.
> But on a terminal that can't display ø, `o/'' would be a good choice.

My point is that there isn't a mechanical way of producing "o/" from
ø, because Unicode decompositions don't support that.
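
To make this concrete, here is a small sketch (Python, using only
the standard unicodedata module; the helper name describe is made up
for illustration) that shows what the canonical decompositions give
for the two characters discussed above.

```python
import unicodedata


def describe(ch):
    """Names of the characters in the canonical (NFD) decomposition of ch."""
    return [unicodedata.name(c) for c in unicodedata.normalize("NFD", ch)]


# U+1E4D (o with tilde and acute) decomposes all the way down to plain "o":
print(describe("\u1e4d"))
# ['LATIN SMALL LETTER O', 'COMBINING TILDE', 'COMBINING ACUTE ACCENT']

# U+01FF (o with stroke and acute) stops at U+00F8, the stroked o:
print(describe("\u01ff"))
# ['LATIN SMALL LETTER O WITH STROKE', 'COMBINING ACUTE ACCENT']

# U+00F8 itself has no decomposition at all, so nothing yields "o" + "/":
print(repr(unicodedata.decomposition("\u00f8")))
# ''
```

So a rule-per-diacritic scheme driven by decompositions can peel off
the acute, but it is left with ø as an opaque base character; getting
from there to "o/" needs a hand-written entry.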

>   > > Not on a Linux console, I think.  When I have f and i in the buffer,
>   > > Emacs does not convert them into a ligature.  The only time it has to
>   > > try to deal with a ligature is when there is a Unicode ligature
>   > > code point in the buffer.
>   > Once again, on a TTY frame Emacs does NOT produce the ligatures nor
>   > combine base characters with the diacritics.
> You have told me this several times, and I believe you.  But how does
> it relate to the case I am talking about?  I don't see a relationship.

As I said, that remark was for other people, those who will read my
email on GUI displays.

