Endianness-aware UTF conversion

bug-gnulib

From:	Ludovic Courtès
Subject:	Endianness-aware UTF conversion
Date:	Sun, 07 Oct 2007 19:33:25 +0200
User-agent:	Gnus/5.11 (Gnus v5.11) Emacs/22.1 (gnu/linux)

Hi Bruno,

Bruno Haible <address@hidden> writes:

> Therefore I would recommend to use the mem_cd_iconveh function from the
> 'striconveh' module, with FROMCODE = locale_charset() and TOCODE =
> "UTF-16BE" or "UTF-16LE" (or vice versa). Or mem_iconveh if you don't
> want to reuse the conversion descriptors.

That's what I was going to do, but the excerpt of `u16-conv-from-enc.c'
that I quoted made me think that, e.g., "UTF-16BE" and "UTF-16LE" were
only known to work with GNU libiconv or Glibc >= 2.2:

  /* Name of UTF-16 encoding with machine dependent endianness and alignment.  */
  #if defined _LIBICONV_VERSION || (__GLIBC__ > 2) || (__GLIBC__ == 2 && __GLIBC_MINOR__ >= 2)
  # ifdef WORDS_BIGENDIAN
  #  define UTF16_NAME "UTF-16BE"
  # else
  #  define UTF16_NAME "UTF-16LE"
  # endif
  #endif
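
The excerpt above decides at compile time, from version macros, whether
the encoding names can be trusted.  As a rough illustration only, the same
question could be asked at run time by probing plain POSIX `iconv_open ()'.
This is a sketch, not what gnulib does, and `iconv_knows_encoding' is a
hypothetical helper name:

```c
#include <iconv.h>
#include <stdbool.h>

/* Hypothetical helper (not part of gnulib): return true if this system's
   iconv accepts NAME, e.g. "UTF-16BE", as a target encoding when the
   source encoding is UTF-8.  */
static bool
iconv_knows_encoding (const char *name)
{
  iconv_t cd = iconv_open (name, "UTF-8");

  /* iconv_open returns (iconv_t) -1 when the conversion is unsupported.  */
  if (cd == (iconv_t) -1)
    return false;

  iconv_close (cd);
  return true;
}
```

A caller could test `iconv_knows_encoding ("UTF-16BE")' once and fall back
to other code when it returns false.  On some systems this needs linking
with `-liconv'.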

Likewise, `u-conv-from-enc.h' contains alternate code for systems where
`UTF_NAME' is undefined (i.e., typically on non-GNU systems).

Therefore, `mem_iconveh ()' doesn't seem appropriate since there is
AFAIUI no portable way to specify, say, "UTF-16{LE,BE}" as TO_CODESET...
which led me to suggest that we might have to provide endianness-aware
conversion procedures.
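
To make the idea concrete, here is a minimal sketch of what such an
endianness-aware procedure might look like if it did not rely on iconv
knowing the "UTF-16BE"/"UTF-16LE" names at all.  All names below
(`utf16_order', `ucs4_to_utf16_bytes', ...) are made up for illustration
and are not an existing gnulib interface; the input is assumed to be an
array of Unicode code points, and no byte-order mark is written:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names, for illustration only.  */
enum utf16_order { UTF16_BIG_ENDIAN, UTF16_LITTLE_ENDIAN };

/* Store one 16-bit code unit at P in the requested byte order.  */
static void
put_unit (unsigned char *p, uint16_t unit, enum utf16_order order)
{
  unsigned char hi = (unsigned char) (unit >> 8);
  unsigned char lo = (unsigned char) (unit & 0xff);

  if (order == UTF16_BIG_ENDIAN)
    {
      p[0] = hi;
      p[1] = lo;
    }
  else
    {
      p[0] = lo;
      p[1] = hi;
    }
}

/* Encode the N code points in SRC as UTF-16 bytes in DST (capacity DSTLEN
   bytes), using the byte order ORDER regardless of the host's endianness.
   Return the number of bytes written, or (size_t) -1 if a code point is
   invalid (a surrogate or above U+10FFFF) or DST is too small.  */
static size_t
ucs4_to_utf16_bytes (const uint32_t *src, size_t n,
                     unsigned char *dst, size_t dstlen,
                     enum utf16_order order)
{
  size_t out = 0;

  for (size_t i = 0; i < n; i++)
    {
      uint32_t c = src[i];

      if (c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF))
        return (size_t) -1;

      if (c < 0x10000)
        {
          /* BMP code point: a single code unit.  */
          if (dstlen - out < 2)
            return (size_t) -1;
          put_unit (dst + out, (uint16_t) c, order);
          out += 2;
        }
      else
        {
          /* Supplementary code point: a surrogate pair.  */
          if (dstlen - out < 4)
            return (size_t) -1;
          c -= 0x10000;
          put_unit (dst + out, (uint16_t) (0xD800 | (c >> 10)), order);
          put_unit (dst + out + 2, (uint16_t) (0xDC00 | (c & 0x3FF)), order);
          out += 4;
        }
    }

  return out;
}
```

Since the byte order is a run-time argument rather than something derived
from WORDS_BIGENDIAN, the same routine can produce either serialization on
any host.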

Does it clarify things a bit?  :-)

Thanks,
Ludovic.
