From: Gregory Heytings
Subject: Re: Unicode confusables and reordering characters considered harmful
Date: Wed, 03 Nov 2021 11:31:37 +0000
There's some data that shows that this is extremely rare in general: the Rust Security Response WG analyzed the 70322 crates and found only 5 in which these codepoints were present (see [1]). That's ~0.01 %.

Moreover such highlighting does not make the source code or text unreadable, even in those few legitimate cases.

Depending on how you define it, there is at least one major world language (Arabic) that has a RTL script, and other major languages such as Urdu, Farsi and Hebrew also use it (and a couple of others too). So I think we should consider to what extent your proposal might hurt users of such languages.

Are these characters important to write comments and strings in any of those languages? Will your proposal make it harder to type in such languages? If yes, are there less invasive solutions?
Thanks for your comments!

AFAIK, these specific characters are not necessary to write comments and strings in these languages. Here are two random files which use RTL strings and comments, and in which these characters are not used:
https://raw.githubusercontent.com/01walid/goarabic/master/stringutils_test.go
https://raw.githubusercontent.com/AbdullahDiaa/garabic/main/garabic.go
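
For anyone who wants to check a file themselves, here is a minimal sketch of such a check. It assumes that "these characters" means the Unicode bidirectional embedding, override and isolate controls (U+202A..U+202E and U+2066..U+2069); that set, and the names BIDI_CONTROLS and find_bidi_controls, are my own choices for illustration and not part of any existing tool.

```python
# Assumption: the characters in question are the Unicode bidirectional
# embedding, override and isolate controls.
BIDI_CONTROLS = frozenset(
    "\u202a\u202b\u202c\u202d\u202e"  # LRE, RLE, PDF, LRO, RLO
    "\u2066\u2067\u2068\u2069"        # LRI, RLI, FSI, PDI
)


def find_bidi_controls(text):
    """Return a list of (line, column, codepoint) for each control found.

    Lines and columns are 1-based; the codepoint is formatted as 'U+XXXX'.
    """
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ch in BIDI_CONTROLS:
                hits.append((lineno, col, f"U+{ord(ch):04X}"))
    return hits
```

Reading a local copy of a file with open(path, encoding="utf-8").read() and passing the result to find_bidi_controls should return an empty list when none of the assumed characters are present. Ordinary RTL letters, such as Arabic or Hebrew text in strings and comments, are not in the set and are not reported.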