Unicode and Guile
Tue, 21 Oct 2003 19:15:34 +0200
What's the plan on internationalization of strings in Guile?
If there is no plan, may I suggest that we move our internal
representation of strings to UTF-8. There's an interesting introductory
article written on www.joelonsoftware.com, although I don't have the
link ATM. UTF-8 has the advantage that ASCII characters (code points 0
through 127) keep their single-byte representation. Characters above
that take two to four bytes each (up to six in the original UTF-8
definition), so any code that assumes one byte per character, which
includes everything that processes user-input strings, has to be
changed. Painful, eh? But if we hope to write apps that deal with all
languages of the world, that's the only way.
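
To make the byte-width point concrete, here is a small sketch in Python
(chosen only for brevity; nothing here is Guile or GLib code, and the
sample characters and the name utf8_width are purely illustrative). It
shows that ASCII text is unchanged under UTF-8 while other characters
occupy several bytes, so byte counts and character counts diverge.

```python
# Illustrative sample characters: U+0041, U+00E9, U+20AC, U+1D11E.
samples = ["A", "\u00e9", "\u20ac", "\U0001d11e"]


def utf8_width(ch: str) -> int:
    """Return the number of bytes UTF-8 uses to encode one character."""
    return len(ch.encode("utf-8"))


widths = [utf8_width(ch) for ch in samples]

# Pure ASCII text is byte-for-byte identical in ASCII and in UTF-8.
ascii_unchanged = "guile".encode("utf-8") == "guile".encode("ascii")

# Once a non-ASCII character appears, the byte length no longer equals
# the character length, which is what breaks byte-indexed string code.
text = "na\u00efve"
byte_length = len(text.encode("utf-8"))
char_length = len(text)

print(widths, ascii_unchanged, byte_length, char_length)
```

Any routine that treats the byte offset as the character index would
give wrong answers for the last string.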
So, reactions on that would be appreciated. To make it easy, may I also
suggest that we use GLib to handle all of the Unicode mess for us. This
does introduce a dependency, but libglib-2.0.so is only 400K and it's
likely to be in memory anyway on most systems. We don't need to expose
any GLib-style functions; they can all be wrapped with their Scheme
equivalents.

Since the underlying representation can still be stored as char*, it
might be possible to make an (ice-9 unicode) library that overrides
all the original bindings for character and string functions. We can
still require that code be written in 7-bit ASCII (the low half of the
byte range), so the reader can stay the same there. Only string
literals would be an issue (some reader modifications required there).
Displaying a string would then be a simple matter of calling
g_locale_from_utf8 () to convert it to the locale's encoding.
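
As a rough sketch of the idea, assuming strings stay in a plain byte
buffer holding UTF-8, the character-aware replacements and the display
step could look like the following. This is Python used as an analogy,
not the GLib API or any existing Guile module; the names
utf8_string_length, utf8_string_ref and utf8_to_locale are hypothetical.

```python
import locale


def utf8_string_length(buf: bytes) -> int:
    """Count characters by skipping UTF-8 continuation bytes (10xxxxxx)."""
    return sum(1 for b in buf if (b & 0xC0) != 0x80)


def utf8_string_ref(buf: bytes, k: int) -> str:
    """Return the k-th character; decodes the whole buffer, then indexes."""
    return buf.decode("utf-8")[k]


def utf8_to_locale(buf: bytes, encoding: str) -> bytes:
    """Re-encode a UTF-8 buffer for display in the given encoding.

    Characters the target encoding cannot represent become '?'.
    """
    return buf.decode("utf-8").encode(encoding, errors="replace")


# The stored representation is still just bytes, as with char* in C.
stored = "na\u00efve".encode("utf-8")

length = utf8_string_length(stored)
third = utf8_string_ref(stored, 2)
as_latin1 = utf8_to_locale(stored, "latin-1")

# For real output one would pass the encoding of the current locale.
for_display = utf8_to_locale(stored, locale.getpreferredencoding(False))
```

The length and indexing functions are linear in the size of the buffer,
which is the usual price of keeping a variable-width encoding
internally.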
My native language is English, so I don't have to deal with this problem
too much. But GNU is not just for European languages, so we should do
our best to spread the love around. Also, working on guile-gtk has
shown me that we really need a comprehensive framework for
internationalization, and it sucks when C is ahead of us in this
department.
- Unicode and Guile,
Andy Wingo <=