Re: Encoding of etc/HELLO

From: Stefan Monnier
Subject: Re: Encoding of etc/HELLO
Date: Mon, 23 Apr 2018 11:23:39 -0400
>> But along the way they discovered that it's sometimes difficult to
>> decide whether two "things" should be considered as one and the same
>> character or not.  They ended up with a set of "rules" to make those
>> decisions, but it's not nearly as simple as "each character has one and
>> only one encoding".
> Not sure what you allude to here.

For example the fact that some CJK characters should be displayed
differently depending on whether they're part of a C text, or a J text,
or a K text, so are they really "one and the same character"?

Of course, there are other related choices: which versions of β should
be one and the same, and which shouldn't?  E.g. I currently see in
Unicode a Greek and a Latin version, plus some variants of a math
version (though none in "roman" shape).
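
To make the β case concrete, here is a rough sketch (mine, not anything
from Emacs) that asks Python's `unicodedata` module which lowercase
beta-like characters its Unicode database knows about, and what NFKC
compatibility normalization folds each one to.  The helper name
`find_betas` and the name-matching rule are assumptions made for
illustration; the exact list depends on the Unicode version bundled with
your Python.

```python
import sys
import unicodedata


def find_betas():
    """Return (code point, name) pairs for lowercase beta-like characters.

    The matching rule is a heuristic on the official character names,
    chosen for this illustration only.
    """
    found = []
    for cp in range(sys.maxunicode + 1):
        # Unnamed code points (unassigned, surrogates, ...) yield "".
        name = unicodedata.name(chr(cp), "")
        if (name.endswith("SMALL BETA")
                or name.endswith("SMALL LETTER BETA")
                or name == "GREEK BETA SYMBOL"):
            found.append((cp, name))
    return found


betas = find_betas()
for cp, name in betas:
    # NFKC applies compatibility decompositions, then recomposes.
    folded = unicodedata.normalize("NFKC", chr(cp))
    targets = " ".join(f"U+{ord(c):04X}" for c in folded)
    print(f"U+{cp:04X} {name}: NFKC -> {targets}")
```

The math variants carry a compatibility mapping to the Greek letter, so
NFKC treats them as "the same" β, whereas a character without such a
mapping stays distinct.  That is one place where those "rules" become
visible.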

There are murky areas, with no "one right answer", although Unicode has
had to choose somehow, i.e. doing the best it can with a messy situation.


