From: | Paolo Bonzini |
Subject: | Re: {Spam?} Re: [Help-smalltalk] [Q] Unicode String? |
Date: | Fri, 07 Jul 2006 11:17:07 +0200 |
User-agent: | Thunderbird 1.5.0.4 (Macintosh/20060530) |
Chun Sungjin wrote:
> Hi,
> main problem is that, for example, if I create an instance of a string like this:
> a := 'Some MultiByte Encoded String'.
> then "a size" does not answer the correct length of the string.

Well, strlen does not do that in C either. You need mbrlen, and #size is more like strlen than mbrlen.
Also, the result heavily depends on the chosen character set. If we want to have #utf8Size, that's fine. But #size should be the number of *bytes*, not of characters.
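
To illustrate the byte-versus-character distinction in C terms, here is a minimal sketch. It assumes the string is valid UTF-8; the helper name utf8_char_count is hypothetical and is not part of libc or GNU Smalltalk. It counts by hand rather than calling mbrlen, because mbrlen decodes according to the current locale, which is exactly the dependence on the chosen character set described above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a libc or GNU Smalltalk function): counts the
   characters (code points) in a NUL-terminated string, ASSUMING the
   bytes are valid UTF-8. Other encodings need different logic. */
size_t utf8_char_count(const char *s)
{
    size_t count = 0;
    for (; *s != '\0'; s++) {
        /* UTF-8 continuation bytes have the form 10xxxxxx; every other
           byte begins a new character. */
        if (((unsigned char)*s & 0xC0) != 0x80)
            count++;
    }
    return count;
}
```

For the UTF-8 bytes "caf\xC3\xA9", strlen answers 5 (bytes), while utf8_char_count answers 4 (characters). The first figure is what #size reports under the convention described here.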
I'm seeing now if I can add an EncodedStream method that extracts Unicode characters. Then what you wanted would be something like

    (EncodedStream wordsOn: 'some string') contents size

for which, of course, we can add a utility method.

Paolo