Re: Handle encoding of Octave strings
From: mmuetzel
Subject: Re: Handle encoding of Octave strings
Date: Sun, 15 Apr 2018 08:32:50 -0700 (MST)
One advantage of using UTF-8 as the internal encoding would be that the
change would be less intrusive. The character matrix type could continue to
be stored in an 8-bit char type.
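
To illustrate why UTF-8 fits into existing 8-bit storage, here is a small
sketch (in Python for brevity; it is not Octave code, and the variable names
are made up for this example). It only shows that UTF-8 data consists of
8-bit units and that the number of stored elements can exceed the number of
characters:

```python
# Illustration only: how text looks when stored as UTF-8 in 8-bit units.
text = "Gr\u00f6\u00dfe"            # "Groesse" with o-umlaut and sharp s: 5 characters
utf8_bytes = text.encode("utf-8")   # sequence of 8-bit code units

num_chars = len(text)               # number of code points
num_bytes = len(utf8_bytes)         # number of 8-bit storage elements

# Every element fits into an unsigned 8-bit value, so an 8-bit char
# array can hold the data without changing its element type.
fits_in_8_bits = all(0 <= b <= 0xFF for b in utf8_bytes)

# Pure ASCII text is byte-for-byte identical in UTF-8.
ascii_unchanged = "size".encode("utf-8") == b"size"
```

The flip side is that one element of the char matrix no longer necessarily
corresponds to one character, which any code indexing into strings would
have to keep in mind.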
I don't think that we need to be Matlab compatible at that level. Even if we
decided later to do so, the switchover would probably be even easier once
the conversion interfaces are in place.
We already have the conversion functions to and from UTF-8 in liboctave (via
our wrapper functions to gnulib: "octave_u8_conv_to_encoding" and
"octave_u8_conv_from_encoding"). So that means the first step is already
done.
We still need a way to determine the encoding of the strings in the m-file.
In the GUI that could be linked to the encoding setting of the editor. For
the CLI, we would need a way to set the "source encoding". We could
probably define an interface in liboctave to set and change the source
encoding (the default would be the system encoding).
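
A rough sketch of what such a "source encoding" setting could look like
(again in Python for brevity). The names `get_source_encoding` and
`set_source_encoding` are hypothetical and do not exist in liboctave; the
sketch only assumes that the default is the system encoding and that unknown
encoding names should be rejected:

```python
import codecs
import locale

# Hypothetical module-level state; not an existing Octave interface.
_source_encoding = None


def get_source_encoding():
    """Return the configured source encoding, or the system default."""
    if _source_encoding is None:
        # Fall back to the encoding preferred by the current locale.
        return codecs.lookup(locale.getpreferredencoding(False)).name
    return _source_encoding


def set_source_encoding(name):
    """Set the source encoding; raises LookupError for unknown names."""
    global _source_encoding
    # codecs.lookup validates the name and returns a normalised form.
    _source_encoding = codecs.lookup(name).name
```

In the GUI, the editor's encoding setting could call the setter; in the CLI,
it could be exposed through a command or a startup option.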
I don't know whether it is possible to distinguish between variables being
read from m-files (any encoding possible) or defined from the command window
(always UTF-8 as it stands right now). Finding a way to distinguish that
might be another step.
Once that is done, we could apply the conversion to strings that are read
from .m files.
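
As a minimal sketch of that conversion step (Python for brevity; the helper
name `to_internal_utf8` is hypothetical, and in liboctave the work would
presumably go through the existing gnulib wrappers instead):

```python
def to_internal_utf8(raw, source_encoding):
    """Re-encode bytes read from a file from source_encoding to UTF-8."""
    return raw.decode(source_encoding).encode("utf-8")


# A string literal as it would appear in an m-file saved as ISO-8859-1:
# "Gr", o-umlaut (0xF6), sharp s (0xDF), "e".
latin1_bytes = b"Gr\xf6\xdfe"
internal = to_internal_utf8(latin1_bytes, "iso-8859-1")

# Input that is already UTF-8 passes through unchanged.
already_utf8 = to_internal_utf8(internal, "utf-8")
```

Bytes that are invalid in the declared source encoding make `decode` raise
an error here; a real implementation would have to decide whether to fail,
warn, or substitute a replacement character.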
More or less independent of the above steps is finding the places where we
would need to convert from the internal encoding (e.g. to UTF-32 for
FreeType and to UTF-16 for Qt) and actually implementing those conversions.
Conversion functions for these operations are also available in gnulib (the
"u8-to-u16" and "u8-to-u32" modules).
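
To make the difference between the two target encodings concrete, here is a
small sketch (Python for brevity; the function names are made up for this
example and are not the gnulib API). It converts UTF-8 bytes into lists of
UTF-16 and UTF-32 code units:

```python
def utf8_to_utf32_units(data):
    """Return the UTF-32 code units (one per code point) for UTF-8 bytes."""
    return [ord(ch) for ch in data.decode("utf-8")]


def utf8_to_utf16_units(data):
    """Return the UTF-16 code units for UTF-8 bytes."""
    # An explicit byte order avoids a leading byte-order mark.
    raw = data.decode("utf-8").encode("utf-16-le")
    return [int.from_bytes(raw[i:i + 2], "little")
            for i in range(0, len(raw), 2)]


# "a", EURO SIGN (U+20AC), and an emoji outside the BMP (U+1F600).
sample = "a\u20ac\U0001F600".encode("utf-8")
utf32 = utf8_to_utf32_units(sample)   # [0x61, 0x20AC, 0x1F600]
utf16 = utf8_to_utf16_units(sample)   # [0x61, 0x20AC, 0xD83D, 0xDE00]
```

Characters outside the Basic Multilingual Plane take one UTF-32 unit but two
UTF-16 units (a surrogate pair), so the lengths of the converted buffers
differ between the two targets.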