[Octave-bug-tracker] [bug #57107] regexp functions fail on ISO-8859 input


From: A.R. Burgers
Subject: [Octave-bug-tracker] [bug #57107] regexp functions fail on ISO-8859 input
Date: Wed, 23 Oct 2019 05:38:51 -0400 (EDT)
User-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.140 Safari/537.36 Edge/18.17763

URL:
  <https://savannah.gnu.org/bugs/?57107>

                 Summary: regexp functions fail on ISO-8859 input
                 Project: GNU Octave
            Submitted by: arb
            Submitted on: Wed 23 Oct 2019 09:38:50 AM UTC
                Category: Octave Function
                Severity: 3 - Normal
                Priority: 5 - Normal
              Item Group: Matlab Compatibility
                  Status: None
             Assigned to: None
         Originator Name: 
        Originator Email: 
             Open/Closed: Open
         Discussion Lock: Any
                 Release: dev
        Operating System: GNU/Linux

    _______________________________________________________

Details:

Consider the following text in an ISO-8859 code page, containing the degree
Celsius symbol, attached as ISO-8859.csv. The Unix file command reports the
file type as ISO-8859 text.


  1T1(°C)


and this script reading it:


f = fopen('ISO-8859.csv');
str = fgets(f);     % read the first line, including its newline
str(end) = '';      % strip the trailing newline
fclose(f);

regexprep(str, '1', '2')


results in this error with octave-6.0.0:


error: regexprep: the input string is invalid UTF-8


Both octave-5.1.1 and Matlab handle this input transparently. If dev does
not, I expect this will lead to quite a few error reports in the future.

The error is also triggered by commands such as strsplit and strtrim since
they invoke regexp functions.
A more extensive test script uu.m is attached.
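
For illustration only, here is a minimal sketch of a possible workaround. It is
not taken from the attached scripts. It assumes the input really is ISO-8859-1
and that the Octave build provides native2unicode. The byte values below are an
assumed stand-in for the line shown above, so the sketch runs without the
attachment. The degree sign is the single byte 176 (0xB0) in ISO-8859-1, which
on its own is not a valid UTF-8 sequence; re-encoding the bytes to UTF-8 before
calling the regexp functions should avoid the error.

```octave
% Assumed ISO-8859-1 bytes of the line "1T1(°C)" plus a trailing newline.
% 176 (0xB0) is the degree sign in ISO-8859-1.
raw = uint8([49 84 49 40 176 67 41 10]);

% Re-encode from the assumed code page to UTF-8, Octave's internal encoding.
str = native2unicode(raw, 'ISO-8859-1');
str(end) = '';   % strip the trailing newline, as in the script above

out = regexprep(str, '1', '2');
disp(out)
```

With a real file, the same bytes could be obtained with
fread(f, Inf, 'uint8=>uint8')' in place of the hard-coded vector.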






    _______________________________________________________

File Attachments:


-------------------------------------------------------
Date: Wed 23 Oct 2019 09:38:50 AM UTC  Name: ISO-8859.csv  Size: 12B   By: arb

<http://savannah.gnu.org/bugs/download.php?file_id=47732>
-------------------------------------------------------
Date: Wed 23 Oct 2019 09:38:50 AM UTC  Name: uu.m  Size: 671B   By: arb

<http://savannah.gnu.org/bugs/download.php?file_id=47733>

    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?57107>

_______________________________________________
  Message sent via Savannah
  https://savannah.gnu.org/



