guile-devel

Re: Internal visibility


From: Mike Gran
Subject: Re: Internal visibility
Date: Thu, 12 Jun 2008 20:45:25 +0000 (UTC)
User-agent: Loom/3.14 (http://gmane.org/)

Ludovic Courtès <ludo <at> gnu.org> writes:

> Yes, that's probably a good idea.  At any rate, we only have
> `scm_to_locale_string ()' currently so it's not too late to add a single
> function with an encoding parameter in lieu of the proposed
> `scm_to_{utf8,utf16,utf32,ucs4,...}_string ()'.
> 
> But first of all, one needs to implement Unicode support.  

FWIW, I have a complete Unicode support library for Guile called GuICU.  It 
lives at http://gano.sourceforge.net.  It works for me but hasn't been 
widely tested.

It is built on the large and cumbersome IBM ICU library.  ICU encodes things 
internally as UTF-16, which I always thought of as a poor idea, since it 
neither allows O(1) seeking of individual codepoints nor works so well with 
UTF-8.
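
To make the seeking point concrete, here is a minimal sketch in C.  The 
function names (utf16_ref, utf32_ref) are made up for illustration and are 
not part of ICU, GuICU or Guile, and the UTF-16 input is assumed to be 
well formed.  Finding the Nth codepoint in UTF-16 means walking from the 
start, because a codepoint above U+FFFF takes two code units; in UTF-32 it 
is a plain array index.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: return the codepoint at codepoint index `index` in a
   UTF-16 buffer of `len` code units, or UINT32_MAX if `index` is out of
   range.  This is O(n): surrogate pairs make the code-unit offset of a
   codepoint unknowable without scanning from the start. */
uint32_t utf16_ref(const uint16_t *s, size_t len, size_t index)
{
    size_t i = 0;

    assert(s != NULL);
    while (i < len) {
        uint32_t c = s[i];
        size_t width = 1;

        /* A high surrogate followed by a low surrogate encodes one
           codepoint in the range U+10000..U+10FFFF. */
        if (c >= 0xD800 && c <= 0xDBFF && i + 1 < len) {
            uint32_t lo = s[i + 1];
            if (lo >= 0xDC00 && lo <= 0xDFFF) {
                c = 0x10000 + ((c - 0xD800) << 10) + (lo - 0xDC00);
                width = 2;
            }
        }
        if (index == 0)
            return c;
        index--;
        i += width;
    }
    return UINT32_MAX;
}

/* Hypothetical helper: the same lookup in UTF-32 is a single O(1) index. */
uint32_t utf32_ref(const uint32_t *s, size_t len, size_t index)
{
    assert(s != NULL);
    return index < len ? s[index] : UINT32_MAX;
}
```

For the three-codepoint string "A", U+1D11E, "B", the UTF-16 buffer holds 
four code units, so utf16_ref has to step over the surrogate pair to reach 
"B", while utf32_ref reads element 2 directly.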

Based on my experience with ICU and putting this library together, and looking 
at what r6rs claims should be the future for Unicode, I really do think that 
UTF-32 is the way to go. 

Alternatively, one could build a string library where strings are 
represented as either u8 or u32 vectors.  If a string function is asked to 
operate on a u32 vector, it will assume a UTF-32 encoding.  If a string 
function is asked to operate on a u8 vector, it will either require a locale 
or, as a fallback, treat the string as a raw byte vector.

This would be twice the work to implement, though.
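
A rough sketch of what the two-representation idea could look like in C 
follows.  Everything here is an assumption made for illustration: the xstr 
names are invented, nothing is taken from Guile or GuICU, and the "locale" 
is reduced to a callback that decodes one byte at a time, which only covers 
single-byte encodings such as ISO-8859-15.  A real implementation would 
also have to handle multibyte locales.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tagged string: either a u8 vector or a u32 vector. */
enum xstr_kind { XSTR_U8, XSTR_U32 };

struct xstr {
    enum xstr_kind kind;
    size_t len;                 /* number of elements in the vector */
    union {
        const uint8_t *u8;      /* locale-encoded or raw bytes */
        const uint32_t *u32;    /* UTF-32 codepoints */
    } data;
};

/* Stand-in for a locale: maps one byte to one codepoint. */
typedef uint32_t (*xstr_decode_byte)(uint8_t byte);

/* ISO-8859-15 matches ISO-8859-1 except at these eight positions. */
uint32_t xstr_latin9(uint8_t b)
{
    switch (b) {
    case 0xA4: return 0x20AC;   /* EURO SIGN */
    case 0xA6: return 0x0160;
    case 0xA8: return 0x0161;
    case 0xB4: return 0x017D;
    case 0xB8: return 0x017E;
    case 0xBC: return 0x0152;
    case 0xBD: return 0x0153;
    case 0xBE: return 0x0178;
    default:   return b;
    }
}

/* Element access dispatching on the representation.  A u32 string is
   taken to be UTF-32.  A u8 string is decoded with `decode` when one is
   supplied; with a NULL decoder the raw byte value is returned. */
uint32_t xstr_ref(const struct xstr *s, size_t i, xstr_decode_byte decode)
{
    assert(s != NULL && i < s->len);
    if (s->kind == XSTR_U32)
        return s->data.u32[i];
    return decode ? decode(s->data.u8[i]) : s->data.u8[i];
}
```

The "twice the work" shows up even in this tiny case: every string 
function needs a branch like the one in xstr_ref, and the u8 branch needs 
both a decoding path and a raw-byte path.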





