Re: need: scm_from_{utf8,latin1}_{string,symbol,keyword}

From: Ludovic Courtès
Subject: Re: need: scm_from_{utf8,latin1}_{string,symbol,keyword}
Date: Mon, 06 Sep 2010 19:02:00 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/23.2 (gnu/linux)


Andy Wingo <address@hidden> writes:

> However, when we have literals in C source code, I think this strategy
> is incorrect. I write my C source code in UTF-8 or in ISO-8859-1, but if
> the user is running in another locale, they will not load my
> strings/symbols/keywords correctly.

Actually, locale encodings are typically ASCII-compatible (info
"(libunistring) Locale encodings"), so it’s rarely (never?) a problem in
practice.
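
To make the ASCII-compatibility point concrete: in an ASCII-compatible
encoding, bytes 0x00-0x7F mean the same characters, so a C literal made
only of such bytes decodes identically whatever the locale.  A minimal
sketch of that check follows; ‘is_ascii_only’ is a hypothetical helper
written for illustration, not a libguile function.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not part of libguile): return true when every
   byte of the NUL-terminated string S lies in the ASCII range
   0x00-0x7F.  Such a literal decodes to the same characters under any
   ASCII-compatible encoding, e.g. UTF-8 or ISO-8859-1.  */
bool
is_ascii_only (const char *s)
{
  for (const unsigned char *p = (const unsigned char *) s; *p != '\0'; p++)
    if (*p > 0x7F)
      return false;
  return true;
}
```

A literal such as "define-module" passes this check; one containing
‘é’ does not, whether it was saved as UTF-8 or as ISO-8859-1, and only
that second kind of literal depends on the encoding used to decode it.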

> The solution is to use functions that specify the locale. We don't have
> those yet, but we do have the capability to write them
> now. Specifically:
>   scm_from_utf8_string
>   scm_from_utf8_symbol
>   scm_from_utf8_keyword
>   scm_from_latin1_string
>   scm_from_latin1_symbol
>   scm_from_latin1_keyword

The ‘latin1’ family should be easy to implement and that’s what we’d use
in our C code.
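
One reason the ‘latin1’ family is simple: each ISO-8859-1 byte is
itself the code point of the character (U+0000 to U+00FF), so decoding
needs no tables and cannot fail.  The sketch below shows the idea by
re-encoding Latin-1 bytes as UTF-8.  ‘latin1_to_utf8’ is a hypothetical
helper, and this is not how libguile does or would implement it.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper (not part of libguile): convert LEN bytes of
   ISO-8859-1 text to a newly allocated, NUL-terminated UTF-8 string.
   Returns NULL if allocation fails; the caller frees the result.  */
char *
latin1_to_utf8 (const char *latin1, size_t len)
{
  /* Worst case: every input byte becomes two UTF-8 bytes.  */
  char *out = malloc (2 * len + 1);
  if (out == NULL)
    return NULL;

  size_t j = 0;
  for (size_t i = 0; i < len; i++)
    {
      /* The byte value is the code point.  */
      unsigned char c = (unsigned char) latin1[i];
      if (c < 0x80)
        out[j++] = (char) c;                    /* ASCII: unchanged.  */
      else
        {
          out[j++] = (char) (0xC0 | (c >> 6));   /* Lead byte.  */
          out[j++] = (char) (0x80 | (c & 0x3F)); /* Continuation byte.  */
        }
    }
  out[j] = '\0';
  return out;
}
```

For instance, the four Latin-1 bytes "caf\xE9" come out as the five
UTF-8 bytes "caf\xC3\xA9", while pure ASCII input is copied unchanged.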


> For example, most GLib-based libraries expect utf-8 strings, but
> Guile-GNOME ignorantly passes them the result of calling
> scm_to_locale_string. Though this will work in UTF-8 locales, it's only
> by accident.

When using (system foreign), one can use:

  (bytevector->pointer (string->utf8 "foo"))

or similar.

Besides, there’s the undocumented ‘scm_from_stringn’ and the internal
‘scm_to_stringn’, which can convert from/to any encoding.  I think they
were initially kept internal because we weren’t quite sure about the
API.  Mike?

Perhaps it’d be enough to make these two functions public and
documented, and add the ‘latin1’ family?

