[Octave-bug-tracker] [bug #53685] textscan() with Delimiter specified al
From: Rik
Subject: [Octave-bug-tracker] [bug #53685] textscan() with Delimiter specified always treats multiple delimiters as one
Date: Wed, 18 Apr 2018 13:25:14 -0400 (EDT)
User-agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:55.0) Gecko/20100101 Firefox/55.0
Update of bug #53685 (project octave):
Status: None => Need Info
_______________________________________________________
Follow-up Comment #1:
Can you run your test with Matlab and see if it fails there? textscan() is a
very complicated function, and unfortunately Octave has to reproduce Matlab
behavior exactly because many users have gotten used to its idiosyncrasies.
As a start, see the documentation at
http://www.mathworks.com/help/matlab/ref/textscan.html.
For issue #1, "spaces in the string fields act as delimiters, creating extra
fields", Matlab says:
    Within each row of data, the default field delimiter is white-space.
    White-space can be any combination of space (' '), backspace ('\b'), or tab
    ('\t') characters. If you do not specify a delimiter, then:
    - the delimiter characters are the same as the white-space characters. The
      default white-space characters are ' ', '\b', and '\t'. Use the
      'Whitespace' name-value pair argument to specify alternate white-space
      characters.
    - textscan interprets repeated white-space characters as a single
      delimiter.
It is essentially as if they set the Delimiter option to whitespace and
MultipleDelimsAsOne to true.
So, if your delimiter is actually a comma then you will need to say so. For
example,
textscan ("A,B C,D", "%s", "delimiter", ',')
ans =
{
[1,1] =
{
[1,1] = A
[2,1] = B C
[3,1] = D
}
}
which works to preserve the space in the string.
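For contrast, here is a minimal sketch of the same input read with and without the Delimiter option. The variable names are only illustrative, and the expected field counts assume the default whitespace handling described in the Matlab text quoted above.

```octave
str = "A,B C,D";

% No Delimiter given: whitespace acts as the delimiter, so the space
% should split the text into two fields ("A,B" and "C,D").
c_default = textscan (str, "%s");

% Delimiter set to a comma: the space stays inside the field "B C".
c_comma = textscan (str, "%s", "delimiter", ",");

printf ("default: %d fields, comma: %d fields\n", ...
        numel (c_default{1}), numel (c_comma{1}));
```

The point is only that the space stops acting as a field separator once a delimiter is named explicitly.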
For the last issue, I find that MultipleDelimsAsOne works. For example,
without the option:
textscan ("A,,,B C,D", "%s", "delimiter", ',')
ans =
{
[1,1] =
{
[1,1] = A
[2,1] =
[3,1] =
[4,1] = B C
[5,1] = D
}
}
As expected, there were two empty strings created where there were extra
commas. Now switching on the MultipleDelimsAsOne option:
textscan ("A,,,B C,D", "%s", "delimiter", ',', "multipledelimsasone", 1)
ans =
{
[1,1] =
{
[1,1] = A
[2,1] = B C
[3,1] = D
}
}
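The two calls can also be compared programmatically rather than by reading the printed cells. This is just a sketch; the helper variable names are made up for illustration.

```octave
str = "A,,,B C,D";

% Each comma counts as its own delimiter (the default).
c_each = textscan (str, "%s", "delimiter", ",");

% Runs of commas are collapsed into a single delimiter.
c_merged = textscan (str, "%s", "delimiter", ",", "multipledelimsasone", 1);

% Count the empty fields in each result.
n_empty_each   = sum (cellfun (@isempty, c_each{1}));
n_empty_merged = sum (cellfun (@isempty, c_merged{1}));

printf ("without option: %d fields, %d empty\n", ...
        numel (c_each{1}), n_empty_each);
printf ("with option:    %d fields, %d empty\n", ...
        numel (c_merged{1}), n_empty_merged);
```

If the two counts come out the same on some input, that would be the kind of one-line case worth attaching to the report.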
I don't know exactly what your file looks like, but let's say you're trying
to read a number, string, number, string.
textscan ("1.1,A,2.2,B C", "%f %s %f %s", "delimiter", ',')
ans =
{
[1,1] = 1.1000
[1,2] =
{
[1,1] = A
}
[1,3] = 2.2000
[1,4] =
{
[1,1] = B C
}
}
That seems right.
If one of the numbers is missing, that seems to work too.
textscan ("1.1,A,,B C", "%f %s %f %s", "delimiter", ',')
ans =
{
[1,1] = 1.1000
[1,2] =
{
[1,1] = A
}
[1,3] = NaN
[1,4] =
{
[1,1] = B C
}
}
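In a script, the missing value can be checked for directly instead of reading it off the printed output. A small sketch, with illustrative variable names, of pulling the individual columns out of the result:

```octave
c = textscan ("1.1,A,,B C", "%f %s %f %s", "delimiter", ",");

first_number  = c{1};         % %f columns come back as numeric arrays
first_string  = c{2}{1};      % %s columns come back as cell arrays of strings
is_missing    = isnan (c{3}); % the empty numeric field was read as NaN
second_string = c{4}{1};      % the space inside "B C" is preserved

printf ("%g | %s | missing=%d | %s\n", ...
        first_number, first_string, is_missing, second_string);
```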
Is there a one-line example that shows how Octave textscan is not behaving
identically to the Matlab textscan function?
_______________________________________________________
Reply to this item at:
<http://savannah.gnu.org/bugs/?53685>
_______________________________________________
Message sent via/by Savannah
http://savannah.gnu.org/