Re: [Help-smalltalk] [Q] Unicode String?

From:	Chun Sungjin
Subject:	Re: [Help-smalltalk] [Q] Unicode String?
Date:	Fri, 7 Jul 2006 17:05:18 +0900

Hi,

The main problem is that if, for example, I create a String instance like this:


a := 'Some MultiByte Encoded String'.

then

a size

does not answer the correct length of the string.

However, I will try what you suggested. Thank you.
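
To make the symptom concrete, here is a minimal sketch in Python (not GNU Smalltalk). It assumes the string is stored as UTF-8, one element per byte; the sample text is only an illustration and does not come from the original message.

```python
# Assumption: the string is held as UTF-8 bytes, one element per byte.
text = "한글"                    # two Korean characters (illustrative sample)
encoded = text.encode("utf-8")  # the multibyte representation

byte_count = len(encoded)       # what a byte-oriented `size` would answer
char_count = len(text)          # the number of characters the user expects

print(byte_count, char_count)
```

The two counts differ because each of these characters takes more than one byte in UTF-8.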

On Jul 7, 2006, at 4:03 PM, Paolo Bonzini wrote:

Chun Sungjin wrote:
Hi,
I've tried GNU Smalltalk and it seems good to me. But I have a problem: the current implementation does not support Unicode. It seems to support only single-byte characters. I've also tried Squeak, which seems slower than GNU Smalltalk (I'm not sure about this and may be wrong) but has a Unicode-compatible string implementation, and I think that kind of approach is good. Is there any chance of having a Unicode-compatible string implementation in the next version of GNU Smalltalk?
What do you need exactly? The main missing thing is support for Character objects with values above 256. However, if you are content with multibyte character sets like UTF-8, or with Unicode character codes, that's fine.
For character set translation, if you load the I18N package, GNU Smalltalk gets an iconv wrapper. The main method you need is EncodedStream>>#on:from:to: (e.g. on: 'abc' from: 'UTF-8' to: 'UCS-4').
To extract Unicode character codes from a UCS-4LE encoded string, you can use (ByteStream on: x asByteArray) and send nextLong. For big-endian, there is no class, but I was thinking of adding a #bigEndian method to ByteStream for the next version.
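
The idea of reading one 32-bit integer per character can be sketched as follows. This is a Python illustration, not the GNU Smalltalk API: the helper name `codepoints_from_ucs4` and its `big_endian` flag are hypothetical, and Python's `utf-32-le` codec stands in for the iconv conversion to UCS-4LE.

```python
import struct

def codepoints_from_ucs4(data, big_endian=False):
    """Read consecutive 32-bit unsigned integers, one per character.

    Hypothetical helper: mirrors repeatedly sending nextLong to a byte stream.
    """
    if len(data) % 4 != 0:
        raise ValueError("UCS-4 data must be a multiple of 4 bytes")
    count = len(data) // 4
    order = ">" if big_endian else "<"   # byte order prefix for struct
    return list(struct.unpack("%s%dI" % (order, count), data))

# Stand-in for the UTF-8 -> UCS-4LE conversion step (sample text is illustrative).
ucs4 = "a한".encode("utf-32-le")
codes = codepoints_from_ucs4(ucs4)
print([hex(c) for c in codes])
```

Passing `big_endian=True` reads the same layout with the opposite byte order, which is the case the reply says has no dedicated class yet.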
Things that could be useful are
   Integer>>#asUTF8String
   String class>>#utf8FromCodepoint: (same as above)
   String>>#utf8Stream
   UTF8Stream (returns Unicode character codes)
   ... (tell me what you need) ...

Paolo
