From: shalasz
Subject: Re: [Lynx-dev] wide-char support with different builds
Date: Thu, 23 Jan 2025 20:40:55 -0500
User-agent: Mozilla/5.0 (Windows NT 6.0; rv:52.0) Gecko/20100101 Thunderbird/52.9.1
2025/01/18 10:15 ... Mouse:
There is no one-size-fits-all character set, nor encoding, nor even serialization, no matter what the priests of the UTF-8 religion would have you believe. If you want to argue that UTF-8 is the best default, that at least is worth discussing. But maintaining that there is any single "the right" character set, encoding, or serialization is...nonsense. There is, at most, "right" for a particular use case, or set of use cases.
ASCII is ugly, Latin-1 is ugly (at the defining meeting, the member from France, no printer, no linguist, no typographer, against his country's tradition repudiated the letter OE. Of course, another member jumped at that and proposed multiplication and division. There is now a hole in the added letters), Unicode is ugly. But UTF-8 is particularly ugly. It has 5 message bits and 3 overhead bits, and writers in Devanagari, ... Malayalam, ... Hangul, ... hiragana & katakana, ... above all in Chinese, find that their text files are huge, bigger than if entered in a 16-bit code (24-bit code, anyone?) with all its surrogates. In bijective binary ("Bijective numeration" in Wikipedia) using seven bits and one marker-bit, one can get up to only three bytes, and inherent uniqueness. Anyone interested in losing UTF-8's deliberate redundancies?
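
For anyone who wants to try it, here is a rough Python sketch of one possible reading of the scheme above: seven payload bits per byte, with the high bit as the marker meaning "another byte follows", and bijective numeration so that no code point has two encodings. The function names (encode_cp, decode_cp, sizes) and the choice of which bit is the marker are my own assumptions, not any existing standard or Lynx code.

```python
def encode_cp(cp: int) -> bytes:
    """Encode one code point in bijective base 128 (hypothetical scheme).

    One byte covers 0..127, two bytes the next 128**2 values,
    three bytes the next 128**3, which is more than enough for U+10FFFF.
    """
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError("not a Unicode code point")
    digits = []
    n = cp
    while True:
        digits.append(n % 128)
        n //= 128
        if n == 0:
            break
        n -= 1  # the bijective step: longer forms never repeat shorter ones
    digits.reverse()
    # Marker bit (0x80) set on every byte except the last.
    return bytes([d | 0x80 for d in digits[:-1]] + [digits[-1]])


def decode_cp(bs: bytes) -> int:
    """Inverse of encode_cp for a single encoded code point."""
    n = 0
    for i, b in enumerate(bs):
        if i > 0:
            n += 1  # undo the bijective step
        n = n * 128 + (b & 0x7F)
    return n


def sizes(text: str) -> dict:
    """Byte counts of the same text under three encodings."""
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),  # no byte-order mark
        "bijective": sum(len(encode_cp(ord(c))) for c in text),
    }


# Six Devanagari code points (the word "namaste"), written as escapes.
sample = "\u0928\u092e\u0938\u094d\u0924\u0947"
print(sizes(sample))
```

In this sketch, code points up to 16511 fit in two bytes, so Devanagari, Malayalam and the kana come out at the same size as in a 16-bit code, while most Chinese and Hangul still take three bytes, the same as in UTF-8. What it gives up relative to UTF-8 is that lead and continuation bytes no longer look different from each other.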