Re: LC_CTYPE implementation help

From: Bruno Haible
Subject: Re: LC_CTYPE implementation help
Date: Thu, 28 Aug 2008 01:31:56 +0200
User-agent: KMail/1.5.4

Aragon Gouveia wrote:
> So I take it this means that if one were writing a locale aware application,
> the application's ability to function predictably is very much up to the
> platform and system on which it runs?  i.e. one can't rely on just ensuring
> gettext is installed correctly...

Yes. gettext does not replace the system's locales. If you are on a system
with broken locales, then either you have a localedef command (like on
glibc or Solaris systems), or you are hosed (that's the case on most
other systems, including *BSD, Cygwin, mingw).

> I use FreeBSD primarily

You might want to try GNU/kFreeBSD instead: a glibc system with FreeBSD
kernel - and so it supports 'localedef'.

> > And be aware that the <ctype.h> functions are meaningless in multibyte 
> > locales
> Does this apply to all systems?  I use FreeBSD primarily, and their locales
> are named, for example, "ja_JP.UTF-8" - this makes me think the FreeBSD
> ctype functions will be multibyte aware...

FreeBSD's locales are certainly multibyte aware. But the <ctype.h>
functions look at a single byte, so isalnum() is not sufficient for
testing whether 'ü' is an alphanumeric character: in a UTF-8 locale,
strlen("ü") == 2.
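
To make the byte-versus-character point concrete, here is a minimal sketch.
The helper name count_alnum_bytes and the constant U_UMLAUT_UTF8 are made up
for illustration; the only assumption is that the string is encoded in UTF-8,
which is why the bytes are spelled out explicitly.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* "ü" in UTF-8 is the byte pair 0xC3 0xBC. The bytes are written out
   so that the example does not depend on the source file's encoding. */
const char U_UMLAUT_UTF8[] = "\xc3\xbc";

/* Hypothetical helper: counts the bytes of s for which isalnum() is
   true. It classifies bytes, not characters, which is all that the
   <ctype.h> functions can do. */
size_t count_alnum_bytes(const char *s)
{
    size_t count = 0;
    for (; *s != '\0'; s++)
        if (isalnum((unsigned char) *s))  /* cast avoids negative char values */
            count++;
    return count;
}
```

strlen(U_UMLAUT_UTF8) is 2, so a single isalnum() call can only ever see one
half of the character. What isalnum() answers for each of the two bytes
depends on the locale, and in no case is it an answer about 'ü' itself.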

> edit: just noticed FreeBSD has ctype functions like iswalnum() for handling
> "wide characters" and are declared in wctype.h.  Cool! :)

Yes, mbtowc() + iswalnum() together are a working replacement for isalnum().
But I would not recommend using functions which work on wide character
*strings* (wchar_t*) - doing so causes more problems than it solves. The
preferred representations for strings continue to be char* strings,
either in locale encoding (the default) or in UTF-8 encoding (see also
the unistr/u8* functions in gnulib).
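
A minimal sketch of the mbtowc() + iswalnum() combination, keeping the string
itself as a char* string. The function name mb_isalnum is made up for this
example, and it only looks at the first character of the string; treat it as
an illustration, not as a finished API.

```c
#include <stdlib.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper: classifies the first multibyte character of s
   according to the current LC_CTYPE locale.
   Returns 1 if it is alphanumeric, 0 if it is not, and -1 if s is
   empty or does not start with a valid multibyte character. */
int mb_isalnum(const char *s)
{
    wchar_t wc;
    int n;

    /* Reset the internal shift state (matters only for stateful encodings). */
    (void) mbtowc(NULL, NULL, 0);

    /* Convert one character; the terminating NUL is included in the
       byte count so that an empty string yields 0 instead of an error. */
    n = mbtowc(&wc, s, strlen(s) + 1);
    if (n <= 0)
        return -1;

    return iswalnum((wint_t) wc) ? 1 : 0;
}
```

Without a setlocale() call the program runs in the "C" locale, where this
handles ASCII only. After setlocale(LC_ALL, "") in a UTF-8 locale,
mb_isalnum("ü") should return 1, provided the system's locale data is
correct, which is exactly the part gettext cannot fix for you.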

