emacs-devel

From: T.V Raman
Subject: Re: Can watermarking Unicode text using invisible differences sneak through Emacs, or can Emacs detect it?
Date: Wed, 19 Jan 2022 09:36:56 -0800
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/29.0.50 (gnu/linux)

Richard Stallman <rms@gnu.org> writes:


This is indeed worrisome and has been around for a while. There is an
even more insidious form of this hack, in which Unicode characters that
"appear like English letters" are substituted for them, and a quick
visual scan will miss it. Spammers often use the trick in domain names
within URLs. For example, there are Cyrillic letters that "look like"
Roman letters.
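
A rough sketch of how such look-alike substitutions might be flagged,
assuming Python's standard unicodedata module. The helper names
scripts_in and looks_mixed are hypothetical. Taking the first word of
each character name is only a heuristic stand-in for the real Unicode
Script property, which the standard library does not expose:

```python
import unicodedata

def scripts_in(text):
    """Return a rough set of scripts used by the letters in text.

    Heuristic: the first word of a character's Unicode name is usually
    its script, e.g. "CYRILLIC SMALL LETTER A" -> "CYRILLIC".
    """
    scripts = set()
    for ch in text:
        if ch.isalpha():
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts

def looks_mixed(text):
    """True if the letters in text appear to come from more than one script."""
    return len(scripts_in(text)) > 1

genuine = "paypal.com"
spoofed = "p\u0430ypal.com"  # U+0430 CYRILLIC SMALL LETTER A replaces Latin "a"

print(looks_mixed(genuine))  # False
print(looks_mixed(spoofed))  # True
```

Legitimate text mixes scripts too, so a check like this can only raise
a warning. It cannot decide on its own that a string is malicious.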
> [[[ To any NSA and FBI agents reading my email: please consider    ]]]
> [[[ whether defending the US Constitution against all enemies,     ]]]
> [[[ foreign or domestic, requires you to follow Snowden's example. ]]]
>
> There is a thread now about confusables.
>
> I read this,
>
>    Unicode allows user tracking by means of invisible text marking. Any
>    string can be converted into its binary form and then recoded into a
>    string of zero-width characters, which can then be invisibly inserted
>    into the text. If the text is posted elsewhere, the zero-width
>    character string can be extracted and the process reversed to figure
>    out the identity of the person who copied it.
>
> which seems to be about a special case of confusables, and it makes me
> wonder whether Emacs does, or could, show users when Unicode confusion
> occurs, or prevent or fix it somehow.
>
> First, is that issue of invisible characters real?
>
> Second, does Emacs do anything now such that these tricks
> won't succeed?
>
> If the problem exists in Emacs now, could we prevent it?  I see a few
> ways to try.  I don't know whether they would work well.
>
> * Indicate the different encodings on the screen somehow.
>
> * Canonicalize such sequences (perhaps when reading text into Emacs),
> so that different encodings of the same text become identical.
>
> * Use a stand-alone canonicalizer program.
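
To make the quoted description concrete, here is a minimal sketch of
the zero-width marking scheme and of two possible countermeasures,
assuming Python's standard unicodedata module. The function names and
the choice of U+200B and U+200C as the two "bit" characters are
illustrative assumptions, not a description of any real tracking tool:

```python
import unicodedata

ZERO = "\u200b"  # ZERO WIDTH SPACE stands for bit 0 (arbitrary choice)
ONE = "\u200c"   # ZERO WIDTH NON-JOINER stands for bit 1 (arbitrary choice)

def embed(visible, secret):
    """Hide secret inside visible as a run of zero-width characters."""
    bits = "".join(f"{byte:08b}" for byte in secret.encode("utf-8"))
    mark = "".join(ONE if b == "1" else ZERO for b in bits)
    # Place the invisible run right after the first visible character.
    return visible[:1] + mark + visible[1:]

def extract(text):
    """Recover the hidden string from the zero-width characters in text."""
    bits = "".join("1" if ch == ONE else "0" for ch in text if ch in (ZERO, ONE))
    usable = len(bits) - len(bits) % 8  # ignore any incomplete trailing byte
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, usable, 8))
    return data.decode("utf-8", errors="replace")

def strip_format_chars(text):
    """Drop every character in general category Cf (format characters)."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Cf")

def canonicalize(text):
    """Make canonically equivalent encodings of the same text identical."""
    return unicodedata.normalize("NFC", text)

marked = embed("Hello, world", "user42")
print(marked == "Hello, world")    # False, though both display the same
print(extract(marked))             # user42
print(strip_format_chars(marked))  # Hello, world
print(canonicalize("e\u0301") == "\u00e9")  # True
```

Two caveats. Normalization alone does not remove zero-width characters
or map Cyrillic look-alikes to Latin letters, so canonicalizing and
stripping are separate steps. And category Cf also covers characters
with legitimate uses, such as joiners in some scripts and
bidirectional marks, so blindly stripping them can damage real text.
Displaying them visibly may be the safer choice.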

-- 

Thanks,

--Raman(I Search, I Find, I Misplace, I Research)
Id: kg:/m/0285kf1


