Endianness-aware UTF conversion


From: Ludovic Courtès
Subject: Endianness-aware UTF conversion
Date: Sun, 07 Oct 2007 19:33:25 +0200
User-agent: Gnus/5.11 (Gnus v5.11) Emacs/22.1 (gnu/linux)

Hi Bruno,

Bruno Haible <address@hidden> writes:

> Therefore I would recommend to use the mem_cd_iconveh function from the
> 'striconveh' module, with FROMCODE = locale_charset() and TOCODE =
> "UTF-16BE" or "UTF-16LE" (or vice versa). Or mem_iconveh if you don't
> want to reuse the conversion descriptors.

That's what I was going to do, but the excerpt of `u16-conv-from-enc.c'
that I quoted made me think that, e.g., "UTF-16BE" and "UTF-16LE" were
only known to work with GNU libiconv or Glibc >= 2.2:

  /* Name of UTF-16 encoding with machine dependent endianness and alignment.  */
  #if defined _LIBICONV_VERSION || (__GLIBC__ > 2) || (__GLIBC__ == 2 && __GLIBC_MINOR__ >= 2)
  # ifdef WORDS_BIGENDIAN
  #  define UTF16_NAME "UTF-16BE"
  # else
  #  define UTF16_NAME "UTF-16LE"
  # endif
  #endif

Likewise, `u-conv-from-enc.h' contains alternate code for systems where
`UTF_NAME' is undefined (i.e., typically on non-GNU systems).

Therefore, `mem_iconveh ()' doesn't seem appropriate since there is
AFAIUI no portable way to specify, say, "UTF-16{LE,BE}" as TO_CODESET...
which led me to suggest that we might have to provide endianness-aware
conversion procedures.
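
To make the idea concrete, here is a rough sketch of what such a procedure
could look like, assuming the text has already been converted to native-endian
UTF-16 code units. The names `u16_to_bytes' and `enum byte_order' are
hypothetical and not part of Gnulib; the point is only that the byte order is
chosen explicitly rather than through an iconv encoding name:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names for illustration only; not part of Gnulib.  */
enum byte_order { ORDER_BIG_ENDIAN, ORDER_LITTLE_ENDIAN };

/* Write the N UTF-16 code units at SRC into DST (which must have room
   for 2 * N bytes) using the requested byte order.  Shifts and masks
   make the result independent of the host's endianness.  */
static void
u16_to_bytes (const uint16_t *src, size_t n, enum byte_order order,
              unsigned char *dst)
{
  size_t i;

  for (i = 0; i < n; i++)
    {
      unsigned char hi = (unsigned char) (src[i] >> 8);
      unsigned char lo = (unsigned char) (src[i] & 0xff);

      if (order == ORDER_BIG_ENDIAN)
        {
          dst[2 * i] = hi;
          dst[2 * i + 1] = lo;
        }
      else
        {
          dst[2 * i] = lo;
          dst[2 * i + 1] = hi;
        }
    }
}
```

A caller would first obtain native-endian code units from the existing
conversion code and then pass them through a routine like this one, so no
"UTF-16LE" or "UTF-16BE" name ever needs to be understood by iconv.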

Does it clarify things a bit?  :-)

Thanks,
Ludovic.




