[Octave-bug-tracker] [bug #41965] __makeinfo__ assumes unsigned char for

From: anonymous
Subject: [Octave-bug-tracker] [bug #41965] __makeinfo__ assumes unsigned char for input text when it should be signed char
Date: Wed, 26 Mar 2014 21:53:22 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/534.34 (KHTML, like Gecko) konqueror/4.8.4 Safari/534.34


                 Summary: __makeinfo__ assumes unsigned char for input text
when it should be signed char
                 Project: GNU Octave
            Submitted by: None
            Submitted on: Wed 26 Mar 2014 09:53:21 PM UTC
                Category: Octave Function
                Severity: 3 - Normal
                Priority: 5 - Normal
              Item Group: Incorrect Result
                  Status: None
             Assigned to: None
         Originator Name: Alan W. Irwin
        Originator Email: address@hidden
             Open/Closed: Open
         Discussion Lock: Any
                 Release: 3.6.2
        Operating System: GNU/Linux



The __makeinfo__ function writes documentation strings to a temporary file with

fwrite (fid, text);

That function assumes the documentation text is contained in an array of
unsigned chars by default.  That's fine for documentation written in 7-bit
ASCII, but when the text contains multi-byte UTF-8 characters, their bytes
(all of which have the high bit set) are replaced by null characters, as can
be seen with the following simple test function:
function test_fwrite_utf8
  fid = fopen("test.out", "w");
  fwrite(fid, "The unicode character, ≥, is output\n");
  fclose(fid);
end

The net result of running that function is

address@hidden> od -a test.out
0000000   T   h   e  sp   u   n   i   c   o   d   e  sp   c   h   a   r
0000020   a   c   t   e   r   ,  sp nul nul nul   ,  sp   i   s  sp   o
0000040   u   t   p   u   t  nl

If the additional argument "schar" is passed to fwrite, all is well and
the UTF-8 characters are written to the file without issues, which appears to
show that Octave strings are generally represented as arrays of signed
chars (as opposed to the arrays of unsigned chars assumed by the
__makeinfo__ function).
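
The comparison described above can be repeated without od using a small helper. This is only a sketch: written_bytes is a hypothetical name, not an Octave function, and it assumes that fread with the "uint8" precision returns the raw bytes of the file unchanged. What the helper reports for text containing bytes above 127 depends on the Octave version.

```octave
## Hypothetical helper (not part of Octave): write TEXT to a temporary file
## using the given fwrite precision and return the bytes that reached disk.
function bytes = written_bytes (text, precision)
  fname = tempname ();
  fid = fopen (fname, "w");
  if (fid < 0)
    error ("written_bytes: could not create temporary file");
  endif
  fwrite (fid, text, precision);
  fclose (fid);

  ## Read the file back as raw bytes, independent of how it was written.
  fid = fopen (fname, "r");
  if (fid < 0)
    error ("written_bytes: could not reopen temporary file");
  endif
  bytes = fread (fid, Inf, "uint8")';   # row vector of byte values
  fclose (fid);
  unlink (fname);
endfunction
```

Calling written_bytes (s, "uchar") and written_bytes (s, "schar") on the test string and comparing the two vectors shows directly whether the bytes of the ≥ character survive or come back as zeros.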

Indeed if the following patch

--- __makeinfo__.m_original     2014-03-26 13:56:42.741106684 -0700
+++ __makeinfo__.m      2014-03-26 13:56:19.005546479 -0700
@@ -120,7 +120,7 @@
     if (fid < 0)
       error ("__makeinfo__: could not create temporary file");
-    fwrite (fid, text);
+    fwrite (fid, text, "schar");
     fclose (fid);

     ## Take action depending on output type

is applied, then the following code

## -*- texinfo -*-
## @deftypefn  {Function File} {@var{a} =} fn (@var{x}, …)
## The unicode character, ≥, is output
## @end deftypefn

function test_utf8
  printf("The unicode character, ≥, is output\n");

yields the following help result:

warning: function ./__makeinfo__.m shadows a core library function
octave:1> help test_utf8
`test_utf8' is a function from the file /home/irwin/test_octave/test_utf8.m

 -- Function File: A = fn (X, …)
     The unicode character, ≥, is output

Without the above patch, the help output is truncated just before
the UTF-8 symbol for greater than or equal to.  This is expected because,
without the patch to the __makeinfo__ function, fwrite writes null
characters in place of the UTF-8 bytes.

