From: Andrew Janke
Subject: [Octave-bug-tracker] [bug #54170] java.lang.String.toCharArray result incorrect conversion to char matrix
Date: Mon, 25 Jun 2018 06:11:25 -0400 (EDT)
User-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.87 Safari/537.36

Follow-up Comment #13, bug #54170 (project octave):

> What is the result for class(cs2{1}) in your example in comment #10?

It is 'char'. In Matlab, Java `char[]` arrays always automatically convert to
Matlab `char`.


>> cs = {'foo', 'föö', 'foobar'}';
>> cs2 = cs; for i = 1:numel(cs); cs2{i} = java.lang.String(cs{i}).toCharArray'; end
>> class(cs2{1})
ans =
    'char'


> We can still think about what calling "char" on a Java char array should
> do...

Yeah, it's a good question.

If you want full Matlab compatibility, I think there's only one thing to do:
convert to a UTF-16-compatible native Octave type. In Matlab, there is no way
to even call "char(...)" on a Java char[] array, because Java char[] arrays
are always implicitly autoconverted to Matlab `char` arrays, so Java char[]
objects are never exposed at the M-code level. (IMHO, this might be another
Matlab design flaw, because it breaks compatibility with Java APIs that take a
caller-supplied Java char[] array as a buffer to fill, i.e. as an output
parameter, e.g. in I/O libraries. But what's done is done.)
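
For illustration, here is a minimal Java sketch of the buffer-filling pattern
I mean. `java.io.Reader.read(char[])` is a real JDK method; the class name
`CharBufferDemo` and the helper `fillBuffer` are made up for this example.
The callee writes into the caller's array, so the pattern only works if the
caller keeps a reference to the same char[] object rather than a converted
copy.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical demo class; only Reader.read(char[]) is actual JDK API.
class CharBufferDemo {
    // Fills the caller-supplied buffer in place and returns the number of
    // chars read, or -1 at end of stream.
    static int fillBuffer(Reader in, char[] buf) {
        try {
            return in.read(buf);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        char[] buf = new char[8];  // allocated by the caller
        int n = fillBuffer(new StringReader("f\u00f6\u00f6"), buf);
        // n is 3 here; the data arrived through buf, not the return value
        System.out.println(n + " chars read: " + new String(buf, 0, n));
    }
}
```

If the char[] were autoconverted to a value-type `char` array on the way in,
the caller would never see what the method wrote into it.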

> It might lead to even more confusion if we introduced a special char class
> just for UTF-16 (and maybe another one for UTF-32?).

Hmm. Might not, depending on where it was used, because most users might not
ever see it, if it were just for Java compatibility. Java APIs don't use
`char[]` except for low level I/O buffer passing, and for storing passwords.
In Java, if you are working directly with `char[]`, it is probably a design
mistake and you should be working with java.lang.String instead.
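
As a rough sketch of the password case: a `char[]` can be overwritten after
use, which an immutable java.lang.String cannot. The class `PasswordDemo` and
the method `checkAndWipe` are hypothetical names for this example, and the
comparison is deliberately simplistic.

```java
import java.util.Arrays;

// Hypothetical demo class, not part of any real API.
class PasswordDemo {
    // Compares the supplied password with the expected one, then zeroes the
    // supplied array so the secret does not linger in memory.
    static boolean checkAndWipe(char[] password, char[] expected) {
        try {
            return Arrays.equals(password, expected);
        } finally {
            Arrays.fill(password, '\0');
        }
    }

    public static void main(String[] args) {
        char[] pw = {'s', '3', 'c'};
        boolean ok = checkAndWipe(pw, new char[] {'s', '3', 'c'});
        System.out.println(ok);              // true
        System.out.println(pw[0] == '\0');   // true: the array was wiped
    }
}
```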

Then again, it would probably need to exist somewhere for Matlab
compatibility, too. And yeah, that would probably lead to confusion. (Though
I'd argue that Octave's current use of `char` as bytes in a Unicode world is
itself probably a source of confusion.)
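
To make the bytes-versus-code-units distinction concrete, here is a small
Java sketch using the 'föö' string from the example above. The class
`CodeUnitsDemo` and its two helpers are made up for illustration;
`String.toCharArray` and `String.getBytes(Charset)` are standard JDK methods.
The non-ASCII characters are written as `\u00f6` escapes so the result does
not depend on the source file encoding.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class for counting code units versus bytes.
class CodeUnitsDemo {
    // Number of UTF-16 code units, i.e. the length of the Java char[].
    static int utf16Units(String s) {
        return s.toCharArray().length;
    }

    // Number of bytes in the UTF-8 encoding of the same string.
    static int utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String s = "f\u00f6\u00f6";  // 'föö'
        System.out.println(utf16Units(s));  // 3
        System.out.println(utf8Bytes(s));   // 5
    }
}
```

So the same three-character string is 3 elements as UTF-16 code units but 5
elements if each element is a UTF-8 byte, which is the mismatch a conversion
between the two representations has to deal with.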

I wonder why the original poster in this discussion was calling toCharArray()
in the first place. @liangtang, can you chime in? Do you have an actual use
case for this, or were you just exploring what these types do?

    _______________________________________________________

Reply to this item at:

  <http://savannah.gnu.org/bugs/?54170>
