Re: Unicode support in io Forge package

From: Markus Mützel
Subject: Re: Unicode support in io Forge package
Date: Sat, 19 Oct 2019 08:16:42 +0200


IIRC, the interface uses UTF-16. The conversion function really only works for Latin-1 encoded input.
There really is no UTF-8 involved. TBH, I chose the name before I had a sufficient grasp of that encoding mess.

Forge packages usually target a wider range of Octave versions. I don't know whether this workaround can be safely removed without losing support for Latin-1 in the older Octave versions supported by io.
I didn't re-read the code, but I believe that "unicode2native" is used if it is available.


PS: Sorry for top-posting. My mobile phone app doesn't allow otherwise.
This message was sent from my Android mobile phone with GMX Mail.
On 19.10.19, 07:04, Andrew Janke <address@hidden> wrote:
Hi, Octave and io maintainers,

I'm confused by the Unicode support in the io package. In particular,
the functions unicode2utf8 and utf82unicode, and the "encode_utf"
options in some of the ods/xls read/write functions.

What is the encoding that utf82unicode/unicode2utf8 are calling
"unicode" here? It looks like it's doing a single-byte encoding,
treating each byte as an unsigned int 0-255, and treating those 0-255
values directly as Unicode code point values. That's not any of the
standard Unicode encodings. (But I think it is exactly the same as
Latin-1/ISO 8859-1.)
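
To illustrate the point above, here is a minimal Python sketch (not the
io package's actual code) of a conversion that maps each byte value
directly to the code point with the same number. The helper name
bytes_as_code_points is hypothetical. It suggests that such a mapping
agrees with Latin-1 decoding for every byte value, and that it mangles
text that was really UTF-8 encoded.

```python
def bytes_as_code_points(raw: bytes) -> str:
    """Treat each byte value 0-255 directly as a Unicode code point.

    Hypothetical helper for illustration only; it mimics the behaviour
    described in the message, not the io package's implementation.
    """
    return "".join(chr(b) for b in raw)


name = "Mützel"
latin1_bytes = name.encode("latin-1")  # 'ü' is the single byte 0xFC
utf8_bytes = name.encode("utf-8")      # 'ü' is the two bytes 0xC3 0xBC

# For Latin-1 input the byte-wise mapping recovers the original text,
# because Latin-1 assigns byte N to code point N.
roundtrip = bytes_as_code_points(latin1_bytes)

# For UTF-8 input each byte of a multi-byte sequence becomes its own
# character, so 'ü' turns into two unrelated characters.
mangled = bytes_as_code_points(utf8_bytes)

# The mapping matches Latin-1 decoding for all 256 possible byte values.
all_bytes = bytes(range(256))
same_as_latin1 = bytes_as_code_points(all_bytes) == all_bytes.decode("latin-1")

print(roundtrip, mangled, same_as_latin1)
```

If that reading is right, input that is already UTF-8 should not be
passed through such a conversion at all.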

As I understand it, since about Octave 4.4, Octave's internal encoding
(that is, how it interprets Octave char arrays) is either UTF-8 or an
opaque array of bytes; it's never in the "system code page" or some
other locale-specific encoding.

Is this UTF-8 support in io still relevant/correct? Maybe it should be
deprecated or renamed/removed? Since Octave now supports UTF-8, I think
you'd want to just leave UTF-8 text as is in all cases.

