Re: [PATCH] fix locale string reading
From: Mark H Weaver
Subject: Re: [PATCH] fix locale string reading
Date: Mon, 14 Nov 2011 19:02:08 -0500
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.0.90 (gnu/linux)
Hi Ludovic!
Ludovic Courtès writes:
>> I think we should consider decoding the command-line arguments using the
>> locale specified by the environment variables, at least in cases like
>> this where there's no way for the user to call setlocale before the
>> conversion happens.
>
> Below is a patch that does roughly that (we should get ‘locale_encoding’
> reviewed and perhaps added to Gnulib.)
>
> It solves the problem:
>
> # With the patch.
> $ ./meta/guile -c '(setlocale LC_ALL "en_US.UTF8")(display (command-line))' -- λ
> (/home/ludo/src/guile/libguile/.libs/guile -- λ)
>
> # Previously.
> $ guile -c '(setlocale LC_ALL "en_US.UTF8")(display (command-line))' -- λ
> (guile -- ??)
Looks great, thanks! :)
I have one question, though. You fixed scm_compile_shell_switches, but I
see another path, through guile.c and init.c, where command-line
arguments are converted to Scheme strings before the user is able to
call setlocale: main (guile.c) calls scm_boot_guile (init.c), which uses
invoke_main_func (init.c), which calls scm_set_program_arguments
(feature.c). Does this code need to be fixed as well?
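To illustrate why the ordering matters, here is a minimal sketch in plain C (not Guile code; the two function names are made up for illustration). It relies only on the standard guarantee that a C program starts in the "C" locale, whatever LC_ALL, LC_CTYPE or LANG say, until something calls setlocale with an empty string:

```c
#include <locale.h>
#include <string.h>

/* Hypothetical helper: return nonzero if the LC_CTYPE locale currently
   in effect is "C".  Passing NULL to setlocale only queries the locale,
   it does not change it.  */
int
ctype_locale_is_c (void)
{
  const char *current = setlocale (LC_CTYPE, NULL);
  return current != NULL && strcmp (current, "C") == 0;
}

/* Hypothetical helper: adopt the locale named by the environment
   variables.  Return nonzero on success.  */
int
adopt_environment_locale (void)
{
  return setlocale (LC_ALL, "") != NULL;
}
```

Any conversion of argv that runs while ctype_locale_is_c () still holds, that is, before something like adopt_environment_locale () has been called, cannot take the user's environment into account unless it inspects the variables itself.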
Thanks,
Mark
> diff --git a/libguile/script.c b/libguile/script.c
> index 5e0685a..20d7b9e 100644
> --- a/libguile/script.c
> +++ b/libguile/script.c
> @@ -26,6 +26,7 @@
> #include <stdio.h>
> #include <errno.h>
> #include <ctype.h>
> +#include <uniconv.h>
>
> #include "libguile/_scm.h"
> #include "libguile/eval.h"
> @@ -368,6 +369,74 @@ scm_shell_usage (int fatal, char *message)
> : SCM_BOOL_F));
> }
>
> +/* Return the name of the locale encoding suggested by environment
> + variables, even if it's not current, or NULL if no encoding is
> + defined. Based on Gnulib's `localcharset.c'. */
> +static const char *
> +locale_encoding (void)
> +{
> + const char *locale, *codeset = NULL;
> +
> + /* Allow user to override the codeset, as set in the operating system,
> + with standard language environment variables. */
> + locale = getenv ("LC_ALL");
> + if (locale == NULL || locale[0] == '\0')
> + {
> + locale = getenv ("LC_CTYPE");
> + if (locale == NULL || locale[0] == '\0')
> + locale = getenv ("LANG");
> + }
> + if (locale != NULL && locale[0] != '\0')
> + {
> + /* If the locale name contains an encoding after the dot, return it. */
> + const char *dot = strchr (locale, '.');
> +
> + if (dot != NULL)
> + {
> + static char buf[2 + 10 + 1];
> + const char *modifier;
> +
> + dot++;
> + /* Look for the possible @... trailer and remove it, if any. */
> + modifier = strchr (dot, '@');
> + if (modifier == NULL)
> + return dot;
> + if (modifier - dot < sizeof (buf))
> + {
> + memcpy (buf, dot, modifier - dot);
> + buf [modifier - dot] = '\0';
> + return buf;
> + }
> + }
> +
> + /* Resolve through the charset.alias file. */
> + codeset = locale;
> + }
> +
> + return codeset;
> +}
> +
> +/* Return a list of strings from ARGV, which contains ARGC strings
> + assumed to be encoded in the current locale. Use `locale_encoding'
> + instead of relying on `scm_from_locale_string' because the user
> + hasn't had a chance to call (setlocale LC_ALL "") yet. */
> +static SCM
> +locale_arguments_to_string_list (int argc, char **const argv)
> +{
> + int i;
> + SCM lst;
> + const char *encoding;
> +
> + encoding = locale_encoding ();
> + for (i = argc - 1, lst = SCM_EOL;
> + i >= 0;
> + i--)
> + lst = scm_cons (scm_from_stringn (argv[i], (size_t) -1, encoding,
> + SCM_FAILED_CONVERSION_ESCAPE_SEQUENCE),
> + lst);
> +
> + return lst;
> +}
>
> /* Given an array of command-line switches, return a Scheme expression
> to carry out the actions specified by the switches.
> @@ -378,7 +447,7 @@ scm_compile_shell_switches (int argc, char **argv)
> {
> return scm_call_2 (scm_c_public_ref ("ice-9 command-line",
> "compile-shell-switches"),
> - scm_makfromstrs (argc, argv),
> + locale_arguments_to_string_list (argc, argv),
> (scm_usage_name
> ? scm_from_locale_string (scm_usage_name)
> : scm_from_latin1_string ("guile")));
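The environment lookup in the quoted `locale_encoding' can be tried outside Guile. What follows is a minimal standalone sketch, not the patch itself: both function names are hypothetical, the caller supplies the buffer instead of a static one, and a locale name without a dot yields NULL here, whereas the patch falls back to returning the whole locale name.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of the patch: copy the codeset part of
   LOCALE (the text between '.' and an optional '@modifier') into BUF,
   which holds SIZE bytes.  Return BUF, or NULL if LOCALE has no codeset
   or the codeset does not fit.  */
const char *
codeset_of_locale_name (const char *locale, char *buf, size_t size)
{
  const char *dot, *end;
  size_t len;

  if (locale == NULL || size == 0)
    return NULL;
  dot = strchr (locale, '.');
  if (dot == NULL)
    return NULL;
  dot++;
  /* Strip the possible @... trailer, if any.  */
  end = strchr (dot, '@');
  len = end != NULL ? (size_t) (end - dot) : strlen (dot);
  if (len == 0 || len >= size)
    return NULL;
  memcpy (buf, dot, len);
  buf[len] = '\0';
  return buf;
}

/* Hypothetical helper: pick the locale name in the same order as the
   patch, LC_ALL first, then LC_CTYPE, then LANG.  Unset and empty
   variables are skipped.  Return NULL if none is usable.  */
const char *
environment_locale_name (void)
{
  static const char *const vars[] = { "LC_ALL", "LC_CTYPE", "LANG" };
  size_t i;

  for (i = 0; i < sizeof vars / sizeof vars[0]; i++)
    {
      const char *value = getenv (vars[i]);
      if (value != NULL && value[0] != '\0')
        return value;
    }
  return NULL;
}
```

With a 16-byte buffer, codeset_of_locale_name ("de_DE.ISO-8859-15@euro", buf, sizeof buf) gives "ISO-8859-15", and a plain "C" gives NULL. Passing the buffer in avoids the fixed-size static array, at the cost of a slightly longer call.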