Re: More i18n
Tue, 12 Dec 2006 10:36:31 +0100
Gnus/5.110006 (No Gnus v0.6) Emacs/21.4 (gnu/linux)
Kevin Ryde <address@hidden> writes:
> address@hidden (Ludovic Courtès) writes:
>> In order to be consistent with the rest of `(ice-9 i18n)',
>> `language-information' accepts an optional argument which should be a
>> locale object. Consequently, `language-information' has to perform
>> appropriate charset conversion.
> If you ask for something from a particular locale object, shouldn't
> you get the charset of that object? That'd be what I'd expect.
My understanding is that, implicitly, Guile currently uses the current
locale's encoding as its "internal representation". For instance,
`scm_from_locale_string ()' stores the input string "as is", i.e., in
the current locale's encoding. Thus, I think it makes sense to remain
consistent with this view.
When we have a non-8-bit internal representation, conversion will
probably be needed in any case.
> AM_ICONV (from gettext, see its manual) is good for that detection, it
> can cope with various oddities.
> DAY_1 smells a lot like C, surely something more schemely is possible?
> After all, guile isn't a C interpreter with parenthesized syntax! :-)
Agreed, but changing the name makes it harder to automate the definition
> You should set up to use localeconv and strftime too, if nl_langinfo
> isn't available, that'd cope with DOS systems. Or if you want to tie
> only to nl_langinfo then calling the function nl-langinfo will make it
> clearest what's being had.
Are there many systems that provide `localeconv' but not `nl_langinfo'?
I guess Windows has the former but not the latter, in which case it
might make sense to support it. That would require a bit of work,
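For the `strftime' half of such a fallback, a minimal sketch in C might look like the following. The name `abday_via_strftime' is made up for illustration and is not part of Guile; the only assumption is standard C `strftime', whose `%a' conversion consults nothing but `tm_wday'.

```c
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper, not part of Guile: store the abbreviated name of
   weekday WDAY (0 = Sunday ... 6 = Saturday) for the current LC_TIME
   locale in BUF, using only standard C.  Return the number of bytes
   written, or 0 if WDAY is out of range or BUF is too small.  */
size_t
abday_via_strftime (int wday, char *buf, size_t size)
{
  struct tm tm;

  if (wday < 0 || wday > 6)
    return 0;

  memset (&tm, 0, sizeof tm);
  tm.tm_wday = wday;            /* `%a' only consults tm_wday */
  return strftime (buf, size, "%a", &tm);
}
```

In the default "C" locale, `abday_via_strftime (0, buf, sizeof buf)' should leave "Sun" in `buf'. Note that weekdays are numbered from 0 here, whereas `ABDAY_1' starts at 1.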
> Though I suspect there's quite a few places in existing code that may
> leak resources of one type or another under out of memory errors. And
> I doubt out of memory is even recoverable at all if genuinely almost
Yes, out-of-memory is currently mostly handled by `abort ()' (e.g.,
> Incidentally, if you're thinking about iconv (which I suspect is not
> needed yet), then I've found it useful to also have conversions that
> put in dummies for untranslatable chars. With glibc the charset names
> "foo//TRANSLIT" etc give that effect, but elsewhere you have to watch
> for EILSEQ and skip that char. Good for output to at least display
The issue is that (i) it is glibc-specific and (ii) it is not a desired
behavior in many cases.
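For reference, the portable variant (watching for EILSEQ and skipping the offending input) could be sketched in C roughly as follows, as an opt-in helper rather than default behavior. The name `convert_with_dummies' and the choice of `?' as the dummy are assumptions made for illustration. The sketch skips one input byte per EILSEQ, so an untranslatable multibyte character may yield several dummies, and it does not flush shift state for stateful encodings. It also assumes an `iconv' that reports EILSEQ for unrepresentable characters; some implementations substitute on their own instead.

```c
#include <errno.h>
#include <iconv.h>
#include <stddef.h>

/* Hypothetical helper: convert IN (IN_LEN bytes in encoding FROM) to
   encoding TO, storing a NUL-terminated result of at most OUT_SIZE bytes
   in OUT.  Each input byte rejected with EILSEQ is replaced by `?'.
   Return 0 on success, -1 on any other error (e.g., E2BIG or EINVAL).  */
int
convert_with_dummies (const char *from, const char *to,
                      const char *in, size_t in_len,
                      char *out, size_t out_size)
{
  iconv_t cd;
  char *inp = (char *) in;      /* iconv takes a non-const pointer */
  char *outp = out;
  size_t out_left;
  int status = 0;

  if (out_size == 0)
    return -1;
  out_left = out_size - 1;      /* keep room for the trailing NUL */

  cd = iconv_open (to, from);
  if (cd == (iconv_t) -1)
    return -1;

  while (in_len > 0)
    {
      if (iconv (cd, &inp, &in_len, &outp, &out_left) != (size_t) -1)
        break;                  /* everything was converted */

      if (errno == EILSEQ && out_left > 0)
        {
          /* Skip one offending input byte and emit a dummy.  */
          inp++, in_len--;
          *outp++ = '?', out_left--;
        }
      else
        {
          status = -1;
          break;
        }
    }

  *outp = '\0';
  iconv_close (cd);
  return status;
}
```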
>> (define priv:locale-abbr-weekday-vector
>> - (vector "Sun" "Mon" "Tue" "Wed" "Thu" "Fri" "Sat"))
>> + (cond-feature (language-information
>> + (vector ABDAY_1 ABDAY_2 ABDAY_3
>> + ABDAY_4 ABDAY_5 ABDAY_6 ABDAY_7))
> This raises a point about the interface. It should be possible to ask
> for "day number N". If ABDAY and friends aren't consecutive then they
> should be.
Right. I guess we could provide a higher-level API, along the lines of:
  (language-information AB_DAY 1)
... that would use the same kind of vector internally to map day numbers
to `nl_langinfo' constants. This could be implemented in Scheme.
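To illustrate the mapping itself, here is the same idea sketched in C rather than Scheme. The name `abday_name' is hypothetical and not part of Guile; the table does not rely on the `ABDAY_*' constants having consecutive values, only on POSIX `nl_langinfo' and <langinfo.h> being available.

```c
#include <langinfo.h>
#include <stddef.h>

/* Day number N (1 = Sunday ... 7 = Saturday) maps to the `nl_langinfo'
   item at index N - 1, whatever the numeric values of the items are.  */
static const nl_item abday_items[7] =
  { ABDAY_1, ABDAY_2, ABDAY_3, ABDAY_4, ABDAY_5, ABDAY_6, ABDAY_7 };

/* Hypothetical helper: return the abbreviated name of day N in the
   current locale, or NULL if N is not in the range 1..7.  */
const char *
abday_name (int n)
{
  if (n < 1 || n > 7)
    return NULL;
  return nl_langinfo (abday_items[n - 1]);
}
```

In the default "C" locale, `abday_name (1)' should return "Sun".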
So, maybe we should rename `language-information' to `nl-langinfo' and
keep the C code as simple as possible, while leaving room for a
higher-level implementation that would possibly deserve the name of