[bug-gnu-libiconv] Generating utf-16 with BOM and specified endianness?


From: Keith Thompson
Subject: [bug-gnu-libiconv] Generating utf-16 with BOM and specified endianness?
Date: Mon, 23 Jan 2012 15:43:47 -0800

The "iconv" command supports "utf-16", "utf-16be", and "utf-16le"
formats (among many others).

For output (e.g., "echo hello | iconv -f ascii -t utf-16be"), the
utf-16be format is big-endian UTF-16 with no BOM (byte order mark);
similarly, utf-16le is little-endian UTF-16 with no BOM.

The "-t utf-16" option causes iconv to generate UTF-16 output *with*
a BOM -- but the endianness is unspecified.  A few experiments suggest
that the generated UTF-16 uses the same endianness as the current
system, but I've seen one report (which I'm trying to verify) of it
generating big-endian output on a little-endian system (x86 Mac OS X).
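
One way to check which byte order a particular iconv picked for
"-t utf-16" is to look at the first two bytes of its output.  The
following is only a sketch: the function name utf16_bom_order is made
up for illustration, and it assumes a POSIX shell with "od" and "tr"
available.

```shell
# utf16_bom_order (hypothetical helper, not part of iconv):
# reads a byte stream on stdin and reports which UTF-16 BOM,
# if any, it starts with.
utf16_bom_order() {
    # Dump only the first two bytes as hex and strip od's padding.
    case "$(od -A n -t x1 -N 2 | tr -d ' \n')" in
        fffe) echo little-endian ;;   # FF FE = BOM in little-endian order
        feff) echo big-endian ;;      # FE FF = BOM in big-endian order
        *)    echo no-bom ;;
    esac
}
```

For example, "echo hello | iconv -f ascii -t utf-16 | utf16_bom_order"
reports what the local iconv produced; the answer depends on the
platform and on which iconv implementation is installed.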

There doesn't seem to be a way to tell iconv to generate UTF-16
output with a BOM *and* a specified endianness.  (For example, the
preferred format for Unicode text on MS Windows is little-endian
UTF-16 with a BOM.)

There are workarounds, such as prepending the BOM manually, or using
"-t utf-16" and filtering the output through "dd conv=swab", but
it would be nice to be able to just specify the format directly.
Suggested syntax:

    iconv -f ... -t utf-16bebom
    iconv -f ... -t utf-16lebom
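
Until such encoding names exist, the two workarounds mentioned above
can be wrapped in small shell functions.  This is a rough sketch, not
a proposed implementation: the function names are invented here, and
it assumes a POSIX shell whose printf accepts octal escapes, plus "dd"
with conv=swab.

```shell
# Hypothetical wrappers; only the iconv and dd invocations come
# from the workarounds described in this message.

# Little-endian UTF-16 with BOM: write FF FE by hand, then let
# iconv append BOM-less utf-16le.  $1 is the source encoding.
to_utf16le_bom() {
    printf '\377\376'
    iconv -f "$1" -t utf-16le
}

# Big-endian UTF-16 with BOM: write FE FF, then BOM-less utf-16be.
to_utf16be_bom() {
    printf '\376\377'
    iconv -f "$1" -t utf-16be
}

# Flip the byte order of an existing UTF-16 stream (BOM included)
# by swapping every pair of bytes.
swap_utf16_bytes() {
    dd conv=swab 2>/dev/null
}
```

Usage would look like "echo hello | to_utf16le_bom ascii > out.txt".
The swap variant only gives a known result if the byte order of its
input is already known, which is exactly what "-t utf-16" does not
guarantee, so the manual-BOM variants are the more predictable ones.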

See also this question that I posted on superuser.com:
http://superuser.com/questions/381056/iconv-generating-utf-16-with-bom

If there's interest in this feature, but insufficient time to
implement it, I'd consider implementing it myself and submitting
a patch.

Another question: There seem to be two separate "iconv" commands.
/usr/bin/iconv on my Ubuntu 11.04 system is part of the libc-bin
package, which seems to be distinct from, but similar to, the
version provided by the libiconv package.  What is the relationship
between them?  Would changes made to one of them show up (eventually)
in the other?

Thanks.

--
Keith Thompson <address@hidden>


