|Subject:||Re: [Gnu-arch-users] Re: How does arch/tla handle encodings?|
|Date:||Sat, 28 Aug 2004 21:32:30 -0400|
|User-agent:||Mozilla Thunderbird 0.5 (X11/20040306)|
Marcus Sundman wrote:
Yes, and for strings with those, UTF-16 is as bad as UTF-8 when it comes to random access.
That means that a general system which wants random access to codepoints must use UTF-32. Anything else is half-measures or worse.
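To make the difference concrete, here is a rough Python sketch, not anything taken from arch/tla. The helper names utf32_codepoint_at and utf16_codepoint_at are made up for illustration, and the sample string is my own. It assumes well-formed little-endian input without a BOM.

```python
def utf32_codepoint_at(data: bytes, index: int) -> str:
    """Fetch codepoint `index` from UTF-32-LE bytes.

    Every codepoint is exactly 4 bytes, so the offset is simple arithmetic.
    """
    return data[4 * index : 4 * index + 4].decode("utf-32-le")


def _utf16_width(data: bytes, pos: int) -> int:
    """Return 4 if the code unit at `pos` is a high surrogate, else 2."""
    unit = int.from_bytes(data[pos : pos + 2], "little")
    return 4 if 0xD800 <= unit <= 0xDBFF else 2


def utf16_codepoint_at(data: bytes, index: int) -> str:
    """Fetch codepoint `index` from UTF-16-LE bytes.

    Surrogate pairs make the width variable, so we have to walk
    from the start of the buffer: O(n), not O(1).
    """
    pos = 0
    for _ in range(index):
        pos += _utf16_width(data, pos)
    return data[pos : pos + _utf16_width(data, pos)].decode("utf-16-le")


# 'a', U+1D11E MUSICAL SYMBOL G CLEF (outside the BMP), 'b'
text = "a\U0001D11Eb"
utf16 = text.encode("utf-16-le")
utf32 = text.encode("utf-32-le")

# Fixed-width indexing works for UTF-32 ...
third_from_utf32 = utf32_codepoint_at(utf32, 2)

# ... but treating UTF-16 as fixed-width lands on a low surrogate.
naive_unit = int.from_bytes(utf16[2 * 2 : 2 * 2 + 2], "little")

# Scanning from the start gives the right answer, at linear cost.
third_from_utf16 = utf16_codepoint_at(utf16, 2)
```

With that string, the naive UTF-16 lookup at index 2 returns the code unit 0xDD1E, which is half of the G clef, while both the UTF-32 lookup and the scanning UTF-16 lookup return 'b'.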
However, if you consider à to be a character, even when it's represented as U+0061 followed by U+0300, then there's no such thing as random access to characters in Unicode. You can't change U+0061 and U+0300 into U+0062 without altering the number of codepoints in the string.
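A small Python sketch of the same point, using only the standard unicodedata module. The sample strings are my own; Python's str indexes by codepoint, so it behaves like the UTF-32 case here.

```python
import unicodedata

# "à" as two codepoints: U+0061 LATIN SMALL LETTER A + U+0300 COMBINING GRAVE ACCENT
decomposed = "a\u0300"

# The same character as one precomposed codepoint, U+00E0
composed = unicodedata.normalize("NFC", decomposed)

# One user-perceived character, two different codepoint counts.
decomposed_len = len(decomposed)
composed_len = len(composed)

# Codepoint indexing does not line up with characters:
# in "x" + "à" + "y", index 2 is the combining accent, not "y".
word = "x" + decomposed + "y"
codepoint_at_2 = word[2]

# Replacing the character à (U+0061 U+0300) with b (U+0062)
# shrinks the string by one codepoint.
replaced = word.replace("a\u0300", "b")
length_before = len(word)
length_after = len(replaced)
```

So even with O(1) access to codepoints, finding "the third character" of word still means scanning for combining marks, and an in-place edit of one character can shift everything after it.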