Re: [Chicken-janitors] #1322: Locale can influence how CHICKEN reads num
From: Chicken Trac
Subject: Re: [Chicken-janitors] #1322: Locale can influence how CHICKEN reads numbers
Date: Sat, 27 Aug 2016 16:55:26 -0000
#1322: Locale can influence how CHICKEN reads numbers
---------------------------------------+----------------------------
Reporter: sjamaan | Owner:
Type: defect | Status: new
Priority: major | Milestone: 4.12.0
Component: core libraries | Version: 4.11.0
Resolution: | Keywords: number parsing
Estimated difficulty: hard |
---------------------------------------+----------------------------
Description changed by sjamaan:
Old description:
> Because CHICKEN uses the libc `strtol`/`strtoll` and `strtod` functions
> when reading flonums and fixnums, locale settings may influence how
> CHICKEN reads numbers, especially in `decode_literal`.
>
> Hugo Arregui provided the following simple test:
>
> {{{
> ;; Compile this with the -embedded option, since it defines its own main()
> (import chicken scheme foreign)
>
> #>
> #include <locale.h>
>
> int main(int argc, char** argv) {
> setlocale(LC_NUMERIC, "es_AR.UTF-8");
> CHICKEN_run(C_toplevel);
> return 0;
> }
> <#
>
> (return-to-host)
> }}}
>
> This fails because the runtime system has several encoded floating-point
> numbers, which will no longer be read correctly. Also note that `strtod`
> might incorrectly "parse" a floating-point number like `1.002` if it
> happens to be valid in the current locale using thousands separators.
>
> Parsing floating-point numbers in C is going to be pretty damn tricky, so
> we might just try and use `setlocale()` to set the locale to `C` and
> restore it to whatever it was before after doing so. I have no idea what
> the effects are of calling these functions often in the same program, and
> if there's a performance impact (it might be loading the strings or
> formatting rules for this locale every single time, on the fly, since
> it'll be designed for "normal" programs in which `setlocale()` will be
> called only a handful of times)
New description:
Because CHICKEN uses the libc `strtol`/`strtoll` and `strtod` functions
when reading flonums and fixnums, locale settings may influence how
CHICKEN reads numbers, especially in `decode_literal`.

Hugo Arregui provided the following simple test:
{{{
;; Compile this with the -embedded option, since it defines its own main()
(import chicken scheme foreign)
#>
#include <locale.h>
int main(int argc, char** argv) {
  setlocale(LC_NUMERIC, "es_AR.UTF-8");
  CHICKEN_run(C_toplevel);
  return 0;
}
<#
(return-to-host)
}}}
This fails because the runtime system contains several encoded
floating-point literals, which are no longer read correctly once
`LC_NUMERIC` has been changed. Also note that `strtod` might silently
misparse a floating-point number like `1.002` if that string happens to be
valid in the current locale, for example as a number written with
thousands separators.
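
To make the failure mode concrete, here is a minimal sketch of how the result of `strtod` can depend on `LC_NUMERIC`. The helper `parse_in_locale` is hypothetical and not part of CHICKEN; it also assumes the requested locale is installed, and it resets the locale to `C` afterwards purely to keep the demo simple.

```c
#include <assert.h>
#include <locale.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper (not CHICKEN code): parse `text` with strtod() while
 * LC_NUMERIC is set to `locale_name`.
 * Returns 0 on success, -1 if setlocale() rejects the locale name.
 * `*consumed` receives the number of characters strtod() accepted. */
static int parse_in_locale(const char *locale_name, const char *text,
                           double *value, size_t *consumed)
{
    char *end;

    assert(locale_name != NULL && text != NULL);
    assert(value != NULL && consumed != NULL);

    if (setlocale(LC_NUMERIC, locale_name) == NULL)
        return -1;

    *value = strtod(text, &end);
    *consumed = (size_t)(end - text);

    /* Demo only: reset to "C" rather than to the previous locale. */
    setlocale(LC_NUMERIC, "C");
    return 0;
}
```

Calling `parse_in_locale("C", "1.002", &v, &n)` consumes the whole string. With `"es_AR.UTF-8"`, where the decimal separator is a comma, I would expect a typical libc to stop at the `.` and report fewer consumed characters, but that depends on the libc and on the locale data being present.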
Parsing floating-point numbers by hand in C is going to be pretty damn
tricky, so we might just use `setlocale()` to switch the locale to `C`
around each parse and restore the previous locale afterwards. I have no
idea what the effects are of calling these functions often in the same
program, or whether there's a performance impact: the libc might load the
strings or formatting rules for the locale on the fly every single time,
since it will have been designed for "normal" programs in which
`setlocale()` is called only a handful of times.

See also https://github.com/JuliaLang/julia/pull/5988 for an example.
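
The save/switch/restore idea described above could look roughly like the following. This is only a sketch under assumptions: `strtod_c_locale` is a made-up name, nothing like it exists in the runtime today, and the performance question raised above is left open. Note that `setlocale()` changes process-wide state and is not thread-safe, so this would only be reasonable where no other thread touches the locale.

```c
#include <assert.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of the CHICKEN runtime): run strtod() with
 * LC_NUMERIC temporarily forced to "C", then restore the previous setting. */
static double strtod_c_locale(const char *text, char **end)
{
    double value;
    char *saved = NULL;
    const char *current;

    assert(text != NULL);

    /* Query the current LC_NUMERIC name. The returned string may be
     * overwritten by later setlocale() calls, so keep a private copy. */
    current = setlocale(LC_NUMERIC, NULL);
    if (current != NULL) {
        saved = malloc(strlen(current) + 1);
        if (saved != NULL)
            strcpy(saved, current);
    }

    setlocale(LC_NUMERIC, "C");
    value = strtod(text, end);

    /* Restore the caller's locale. If the copy could not be made,
     * the locale stays at "C". */
    if (saved != NULL) {
        setlocale(LC_NUMERIC, saved);
        free(saved);
    }
    return value;
}
```

A caller would use it exactly like `strtod`, for example `strtod_c_locale("1.002", &end)`. Every call pays for two `setlocale()` switches, which is precisely the cost the description is unsure about.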
--
Ticket URL: <https://bugs.call-cc.org/ticket/1322#comment:2>
CHICKEN Scheme <https://www.call-cc.org/>
CHICKEN Scheme is a compiler for the Scheme programming language.