From: Ben Abbott
Subject: Re: [changeset] Asian Characters and strchr()
Date: Wed, 11 Mar 2009 20:06:46 +0800

On Mar 11, 2009, at 5:30 PM, Jaroslav Hajek wrote:
> On Wed, Mar 11, 2009 at 9:14 AM, Ben Abbott <address@hidden> wrote:
>> On Mar 11, 2009, at 4:03 PM, Jaroslav Hajek wrote:
>>> On Wed, Mar 11, 2009 at 8:53 AM, Ben Abbott <address@hidden> wrote:
>>>> On Mar 11, 2009, at 3:33 PM, Jaroslav Hajek wrote:
>>>>> On Wed, Mar 11, 2009 at 8:24 AM, Ben Abbott <address@hidden> wrote:
>>>>>> I noticed that fileparts gives an error when the full file name
>>>>>> contains Asian characters.
>>>>>>
>>>>>>   octave:209> fileparts ("System/Library/Fonts/junk.ttf")
>>>>>>   error: subscript indices must be either positive integers or logicals.
>>>>>>   error: called from:
>>>>>>   error:   /Users/bpabbott/Development/mercurial/octave-3.1.53/scripts/strings/strchr.m at line 40, column 19
>>>>>>   error:   /Users/bpabbott/Development/mercurial/octave-3.1.53/scripts/miscellaneous/fileparts.m at line 30, column 10
>>>>>>
>>>>>> It appears that there is a simple fix for strchr, but it will depend
>>>>>> upon the ASCII equivalent for Asian fonts. I'm seeing negative values.
>>>>>>
>>>>>>   fullfile = "System/Library/Fonts/junk.ttf";
>>>>>>   octave:211> double(fullfile)
>>>>>>   ans =
>>>>>>    Columns 1 through 16:
>>>>>>      83  121  115  116  101  109   47   76  105   98  114   97  114  121   47   70
>>>>>>    Columns 17 through 32:
>>>>>>     111  110  116  115   47  -27 -115 -114  -26 -106 -121  -25  -69 -122  -23  -69
>>>>>>    Columns 33 through 37:
>>>>>>    -111   46  116  116  102
>>>>>>
>>>>>> Can anyone tell me what the permissible range for integer values of
>>>>>> Asian characters is?
>>>>>
>>>>> I think a char->double conversion is supposed to yield nonnegative
>>>>> values, so this seems buggy.
>>>>>
>>>>>> I'm planning to patch strchr, any reason I shouldn't do that?
>>>>>
>>>>> I don't think there's a bug in strchr. This is clearly caused by the
>>>>> negative values.
>>>>
>>>> For Asian fonts the values are 16-bit ... unsigned or signed I don't know.
>>>
>>> No, they're not. See your own example. Octave has no support for UTF-8
>>> strings, so unless "char" is more than 8 bits, the result will be an
>>> 8-bit number. Thus, "strchr" won't search for Japanese characters (but
>>> this does not matter here, since you only need to find an ASCII character).
>>>
>>> Currently, the sign of char -> double is left up to the compiler, which
>>> I don't think is good. I think we should guarantee that to be
>>> positive, the same as Matlab does. Shall I make a patch, or do you wish
>>> to do it?
>>
>> OK, I hadn't considered how many bits Octave was using for characters.
>>
>> In any event, please do make a patch (I'm not competent enough in C++ to do
>> it myself).
>>
>> In the meantime, I'll avoid using fileparts and strchr when there may be
>> Asian characters present.
>>
>> Ben
>
> Fix is uploaded.
>
> regards

It works as expected!

Thanks
Ben
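
For readers of the archive, the point made in the quoted discussion (the sign of a char -> double conversion is left to the compiler, and should instead be guaranteed nonnegative) can be illustrated with a minimal C++ sketch. This is not the uploaded changeset and not Octave's actual source; the function names `bytes_to_double` and `bytes_to_double_naive` are hypothetical and exist only for this example.

```cpp
#include <string>
#include <vector>

// Hypothetical helper, not taken from the Octave sources.
// Each byte is first reinterpreted as unsigned char, so the resulting
// doubles are always in the range 0..255, whether or not plain `char`
// is signed on the compiler in use.
std::vector<double> bytes_to_double(const std::string& s) {
    std::vector<double> out;
    out.reserve(s.size());
    for (char c : s) {
        out.push_back(static_cast<double>(static_cast<unsigned char>(c)));
    }
    return out;
}

// For comparison only: a direct conversion. Where plain `char` is signed,
// bytes >= 0x80 (such as the bytes of a UTF-8 multibyte sequence) come out
// negative; where it is unsigned, they come out positive.
std::vector<double> bytes_to_double_naive(const std::string& s) {
    std::vector<double> out;
    out.reserve(s.size());
    for (char c : s) {
        out.push_back(static_cast<double>(c));
    }
    return out;
}
```

With the first helper, the three bytes reported above as -27 -115 -114 (0xE5 0x8D 0x8E) become 229 141 142, which are valid positive values and differ from the negative ones by exactly 256. ASCII characters such as "/" (47) are unchanged by either version.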