bug-guile

bug#20822: environment mangled by locale


From: Zefram
Subject: bug#20822: environment mangled by locale
Date: Fri, 4 Mar 2016 23:22:30 +0000

I wrote:
>There's an obvious parallel with reading data from an input port.
>If setlocale is called, then input is by default decoded according
>to locale, including the very lossy ASCII decode for C/POSIX.  But if
>setlocale has not been called, then input is by default decoded according
>to ISO-8859-1, preserving the actual octets.  It would probably be most
>sensible that, if setlocale hasn't been called, getenv should likewise
>decode according to ISO-8859-1.  It might also be sensible to offer
>some explicit control over the encoding to be used with the environment,
>just as I/O ports have a concept of per-port selected encoding.

In the light of what I've learned recently about Guile's locale handling,
this needs some revision.  What I thought was a well-defined "setlocale
not called" state is a mirage.  The encoding of ports is not reliably
fixed at ISO-8859-1; per bug#22910 it can be affected by ostensibly
read-only calls to setlocale, and seems to be only accidentally
ISO-8859-1 until that's done.  So that's not a good model.  Due to the
GUILE_INSTALL_LOCALE mechanism, a program that wants no locale selected
can't achieve that simply by never calling setlocale in write mode.
So "setlocale has not been called" is not really available as a state
with which to control anything.

So it would seem to be necessary to use some explicit control of character
encoding for environment access.  (This must be control of encoding
per se, not merely of which locale to use for environment access,
because, as I noted in the original report, there's no guarantee of a
locale with a suitable encoding.)  This could be an optional parameter
to the environment access functions, or a settable variable that takes
precedence over locale to determine encoding for all environment access.
The latter would match the encoding model used by ports.
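
The precedence I have in mind can be sketched as follows.  This is Python,
not Guile, and every name in it (EnvironmentDecoder, its encoding
attribute, decode, getenv) is hypothetical, invented for illustration.
It assumes a Unix-like system where os.environb exposes the raw octets.
It combines both options: a per-call parameter, then a settable encoding,
then the locale as the last resort.

```python
import locale
import os


class EnvironmentDecoder:
    """Hypothetical environment accessor with a settable encoding,
    analogous to a port that carries its own selected encoding."""

    def __init__(self, encoding=None):
        # None means "no explicit choice; fall back to the locale".
        self.encoding = encoding

    def decode(self, raw, encoding=None):
        # Precedence: per-call argument, then the settable encoding,
        # then whatever encoding the current locale implies.
        chosen = (encoding
                  or self.encoding
                  or locale.getpreferredencoding(False))
        return raw.decode(chosen, errors="replace")

    def getenv(self, name, encoding=None):
        # os.environb holds the undecoded octets (Unix only).
        raw = os.environb.get(os.fsencode(name))
        return None if raw is None else self.decode(raw, encoding)


env = EnvironmentDecoder()
env.encoding = "iso-8859-1"   # takes precedence over the locale
```

With the encoding set to ISO-8859-1 as in the last line, every octet of a
variable's value survives decoding regardless of which locale, if any,
happens to be in effect.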

-zefram




