Re: mutable interfaces - was: Guile: What's wrong with this?

From: Mark H Weaver
Subject: Re: mutable interfaces - was: Guile: What's wrong with this?
Date: Sat, 07 Jan 2012 13:30:33 -0500
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.0.92 (gnu/linux)

Bruce Korb <address@hidden> writes:

> On 01/07/12 08:13, Mark H Weaver wrote:
>>> Most of the strings that I wind up altering are created with a
>>> scm_from_locale_string() C function call.
>>
>> BTW, beware that scm_from_locale_string() is only appropriate for
>> strings that came from the user (e.g. command-line arguments, reading
>> from a port, etc.).  When converting string literals from your own source
>> code, you should use scm_from_latin1_string() or scm_from_utf8_string().
>> Similarly, to make symbols from C string literals, use
>> scm_from_latin1_symbol() or scm_from_utf8_symbol().
>>
>> Caveat: these functions did not exist in Guile 1.8.  If your C string
>> literals are ASCII-only, I guess it won't matter in practice which
>> function you use, although it would be good to spread the understanding
>> that C string literals should not be interpreted according to the user's
>> locale.
>
> I go back to my argument that a facilitation language needs to focus
> on being as helpful as possible.  That means doing what is likely
> wanted instead of throwing errors at every possibility.  It also means
> not changing interfaces.

Sorry, but there's no way to maintain backward compatibility here.  I
know it's a pain, but there's no getting around the fact that in order
to write proper internationalized code, we now need to think carefully
about what encoding a particular string is in.  There's no automatic way
to handle this, not even in principle.

Fortunately, most modern GNU/Linux systems default to a UTF-8 locale, in
which case scm_from_locale_string and scm_from_utf8_string will produce
the same result anyway.  However, there are still some systems that use
a non-UTF-8 locale, and we must strive to support them properly.
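
To make the difference concrete, here is a minimal sketch in plain C (no
Guile involved) that counts the characters in the same byte sequence under
two different encoding assumptions.  The function names are hypothetical
and exist only for this illustration, and the UTF-8 counter assumes
well-formed input.

```c
#include <stddef.h>

/* Latin-1: every byte is exactly one character. */
size_t count_chars_latin1(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}

/* UTF-8 (assumed well-formed): count every byte that is not a
 * continuation byte of the form 10xxxxxx. */
size_t count_chars_utf8(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; s++)
        if (((unsigned char)*s & 0xC0) != 0x80)
            n++;
    return n;
}
```

For the bytes "caf\xc3\xa9" (the UTF-8 encoding of "café"), the Latin-1
reading yields 5 characters and the UTF-8 reading yields 4.  For pure
ASCII input the two readings agree, which is why the choice of function
only starts to matter once non-ASCII bytes appear.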

> Anyway, this then?  (abbreviated)
> #if   GUILE_VERSION < 107000
> # define AG_SCM_STR02SCM(_s)          scm_makfrom0str(_s)
> # define AG_SCM_STR2SCM(_st,_sz)      scm_mem2string(_st,_sz)
> #elif   GUILE_VERSION < 200000
> # define AG_SCM_STR02SCM(_s)          scm_from_locale_string(_s)
> # define AG_SCM_STR2SCM(_st,_sz)      scm_from_locale_stringn(_st,_sz)
> #elif   GUILE_VERSION < 200004
> #error "autogen does not work with this version of guile"
>   choke me.

This last clause is wrong.  scm_from_utf8_string and
scm_from_utf8_stringn were already available in Guile 2.0.0, so there
is no need to reject versions between 2.0.0 and 2.0.4.

> #else
> # define AG_SCM_STR02SCM(_s)          scm_from_utf8_string(_s)
> # define AG_SCM_STR2SCM(_st,_sz)      scm_from_utf8_stringn(_st,_sz)
> #endif

Just remember that this change implies that these macros should only be
used for C string literals, and must _not_ be used for strings supplied
by the user (e.g. command-line arguments and I/O).

It could very well be that you're currently overloading these macros
for both purposes, in which case you should split this pair of macros
into two distinct pairs: one pair of macros for user strings (keep using
scm_from_locale_string{,n} for these), and one pair for C string
literals (use scm_from_utf8_string{,n} for Guile 2.0.0 or newer).
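
A rough sketch of what that split might look like follows.  The macro
names AG_SCM_USTR* and AG_SCM_LIT* are hypothetical, GUILE_VERSION is
assumed to be the same version macro used in the quoted code, and the
fallbacks for older Guile versions simply reuse the functions from the
quoted code.  It has not been tested against every Guile release.

```c
#include <libguile.h>

/* GUILE_VERSION: assumed to be the version macro from the quoted code.
 * AG_SCM_USTR*:  hypothetical names, for strings that came from the user.
 * AG_SCM_LIT*:   hypothetical names, for C string literals. */

#if GUILE_VERSION < 107000
/* Very old Guile: only one conversion exists, so both pairs share it. */
# define AG_SCM_USTR02SCM(_s)         scm_makfrom0str(_s)
# define AG_SCM_USTR2SCM(_st,_sz)     scm_mem2string(_st,_sz)
# define AG_SCM_LIT02SCM(_s)          scm_makfrom0str(_s)
# define AG_SCM_LIT2SCM(_st,_sz)      scm_mem2string(_st,_sz)
#else
/* User strings are always interpreted according to the locale. */
# define AG_SCM_USTR02SCM(_s)         scm_from_locale_string(_s)
# define AG_SCM_USTR2SCM(_st,_sz)     scm_from_locale_stringn(_st,_sz)
# if GUILE_VERSION < 200000
/* Guile 1.8 has no UTF-8 variants; acceptable for ASCII-only literals. */
#  define AG_SCM_LIT02SCM(_s)         scm_from_locale_string(_s)
#  define AG_SCM_LIT2SCM(_st,_sz)     scm_from_locale_stringn(_st,_sz)
# else
/* Guile 2.0.0 and newer: literals are decoded as UTF-8. */
#  define AG_SCM_LIT02SCM(_s)         scm_from_utf8_string(_s)
#  define AG_SCM_LIT2SCM(_st,_sz)     scm_from_utf8_stringn(_st,_sz)
# endif
#endif
```

With something like this in place, a command-line argument would go
through AG_SCM_USTR02SCM(argv[1]), while a literal in the source would go
through AG_SCM_LIT02SCM("some-name").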

Then look at each use of these old overloaded macros in your code, and
figure out whether it's operating on a string that came from the user or
a string that came from your own source code.

Again, I stress that this issue is not specific to Guile.  All software,
if it wishes to be properly internationalized, needs to think about
where a string came from.  In general, your program's source code (and
thus the C string literals it contains) will have a different encoding
than C strings that come from the user.  C strings of different
encodings are essentially of different types (even though C's type
system is too crude to distinguish them), and you must treat them as

