Re: mutable interfaces - was: Guile: What's wrong with this?

From: Mark H Weaver
Subject: Re: mutable interfaces - was: Guile: What's wrong with this?
Date: Sat, 07 Jan 2012 13:55:55 -0500
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.0.92 (gnu/linux)

Replying to myself...

> Again, I stress that this has nothing to do with Guile.  All software,
> if it wishes to be properly internationalized, needs to think about
> where a string came from.  In general, your program's source code (and
> thus the C string literals it contains) will have a different encoding
> than C strings that come from the user.  C strings of different
> encodings are essentially of different types (even though C's type
> system is too crude to distinguish them), and you must treat them as
> such.

In case it wasn't clear: Scheme strings don't have any encoding; they
are a sequence of Unicode characters.  Therefore, you never have to
think about where a Scheme string came from.  What you need to think
about is where a raw sequence of bytes came from, whether it be a C
string (C chars are not characters but merely bytes), a Scheme
bytevector, a command-line argument, an environment variable, or data
read from a file descriptor.
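
To illustrate the difference between bytes and characters, here is a
minimal C sketch. The names e_acute_utf8, e_acute_latin1 and
utf8_code_points are made up for this example and are not part of Guile
or any library. It shows the same character, U+00E9, as two different
byte sequences depending on the encoding:

```c
#include <stddef.h>
#include <string.h>

/* The same character, U+00E9 (e with acute accent), as raw bytes in
   two encodings.  The C type is identical; only the encoding differs. */
static const char e_acute_utf8[]   = "\xC3\xA9"; /* UTF-8: two bytes */
static const char e_acute_latin1[] = "\xE9";     /* ISO-8859-1: one byte */

/* Hypothetical helper: count the code points in a NUL-terminated byte
   string by counting the bytes that are not UTF-8 continuation bytes
   (those of the form 10xxxxxx).  Assumes the input is valid UTF-8. */
size_t utf8_code_points(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; s++)
        if (((unsigned char)*s & 0xC0) != 0x80)
            n++;
    return n;
}
```

strlen reports 2 for e_acute_utf8 and 1 for e_acute_latin1, although
both represent a single character. Nothing in the C type system tells
you which is which, so you have to know where the bytes came from.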

Ideally, our code would make these distinctions very clear.  However, if
you're not motivated (or don't have time) to fix that properly right
now, there's one fact that can save you a lot of time: on GNU/Linux and
POSIX systems, every locale encoding is compatible with ASCII.
Therefore, if you know that a C string contains only ASCII characters,
then you don't need to think about whether to use scm_from_locale_string
or scm_from_utf8_string, because both will produce the same Scheme
string.
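
If you want to check that assumption instead of relying on it, a
sketch along these lines might do. The name bytes_are_ascii is
hypothetical, not a Guile function. It only inspects the bytes and does
not call into libguile:

```c
#include <stdbool.h>

/* Hypothetical helper: return true if every byte of the NUL-terminated
   string S is in the ASCII range 0x00..0x7F.  For such strings,
   scm_from_locale_string and scm_from_utf8_string should give the same
   result, since every locale encoding is assumed ASCII-compatible. */
bool bytes_are_ascii(const char *s)
{
    for (; *s != '\0'; s++)
        if ((unsigned char)*s > 0x7F)
            return false;
    return true;
}
```

A string literal such as "hello" passes this check, while the UTF-8
bytes "\xC3\xA9" do not. For the latter you still have to decide which
encoding the bytes are in before converting them.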

