From: Aaron Bentley
Subject: Re: [Gnu-arch-users] Re: How does arch/tla handle encodings?
Date: Sat, 28 Aug 2004 21:32:30 -0400
User-agent: Mozilla Thunderbird 0.5 (X11/20040306)
Marcus Sundman wrote:
Yes, and for strings with those UTF-16 is as bad as UTF-8 when it comes to random access.
That means that a general system which wants random access to codepoints must use UTF-32. Anything else is half-measures or worse.
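To illustrate the point about UTF-32, here is a minimal Python sketch. It is only an illustration, not anything taken from tla: the helper name `codepoint_at_utf32` and the sample string are made up for this example. It assumes a little-endian UTF-32 buffer without a byte-order mark, so codepoint number n sits at byte offset 4*n, while the same string in UTF-16 needs two code units for the codepoint outside the BMP.

```python
import struct


def codepoint_at_utf32(buf: bytes, index: int) -> int:
    """Return the codepoint at position `index` in a UTF-32-LE buffer.

    Hypothetical helper: every codepoint is exactly 4 bytes wide,
    so the lookup is plain offset arithmetic with no scanning.
    """
    (value,) = struct.unpack_from("<I", buf, 4 * index)
    return value


# 'a', MUSICAL SYMBOL G CLEF (U+1D11E, outside the BMP), 'b'
text = "a\U0001D11Eb"

utf16 = text.encode("utf-16-le")  # no BOM, 2 bytes per code unit
utf32 = text.encode("utf-32-le")  # no BOM, 4 bytes per code unit

utf16_units = len(utf16) // 2  # the clef needs a surrogate pair here
utf32_units = len(utf32) // 4  # one unit per codepoint

second = codepoint_at_utf32(utf32, 1)
third = codepoint_at_utf32(utf32, 2)
```

In the UTF-16 form, code unit index 2 lands in the middle of the surrogate pair, so finding the third codepoint means scanning from the start. In the UTF-32 form the index is the offset divided by four.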
However, if you consider à to be a character, even when it's represented as U+0061 followed by U+0300, then there's no such thing as random access to characters in Unicode. You can't change U+0061 and U+0300 into U+0062 without altering the number of codepoints in the string.
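The same thing can be seen in a short Python sketch, since Python strings are indexed by codepoint. The sample word and variable names are assumptions chosen for illustration only.

```python
import unicodedata

# 'a' followed by COMBINING GRAVE ACCENT: one character to a reader,
# two codepoints to the string.
decomposed = "a\u0300"

# NFC normalization composes the pair into the single codepoint U+00E0.
precomposed = unicodedata.normalize("NFC", decomposed)

# A sample word ending in the decomposed form.
word = "voil" + decomposed

# Replacing the two-codepoint sequence with 'b' (U+0062) shortens the
# string by one codepoint, so every later index shifts.
edited = word.replace(decomposed, "b")

lengths = (len(decomposed), len(precomposed), len(word), len(edited))
```

Here `word[4]` is the bare 'a' and `word[5]` is the combining accent, so indexing by codepoint does not return what a reader would call the fifth character.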
Aaron