octave-maintainers
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: AW: textscan


From: Philip Nienhuis
Subject: Re: AW: textscan
Date: Sun, 29 Apr 2012 23:06:56 +0200
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.11) Gecko/20100701 SeaMonkey/2.0.6

Graf, Alexander wrote:
> Dear Philip, der Ben,
>
> I´m not sure I understand every detail of your discussion, but I
> can assure you that you do not need to go into trouble thinking
> how you can attribute this to me - it was a very small thing
> after all and it was a pleasure and very helpful for me to see
> your quick reaction and further explanations.

Well, you reported the bug AND proposed the actual fix. That's sufficient to earn eternal fame in the changelog :-)

> BTW, I had another small problem meanwhile but it might be
> something very particular to my input dataset, and thus not
> relevant for most applications. Also it seems to happen within
> strread and I didn´t bother to find out how, maybe it´s a wanted
> property of strread: It seems that the output arrays of the
> different columns can sometimes have different row numbers (at
> least by 1, this happened with my data, maybe because of readin
> problems with the last row).

What do you mean exactly?

1. Are some entries ("rows") in some columns skipped (missing)? IOW, do the output columns get out-of-phase?
Like (note missing b2):
a1  b1  c1
a2  b3  c2
a3  b4  c3
a4      c4

 -- OR --

2. Are all expected data entries in all columns present (including empty fields) but do the columns have unequal length in the original file and is that reflected in the output?
Like:
a1  b1  c1
a2  b2  c2
a3  b3  c3
a4  b4  c4
a5

Could you send me a specimen file where you get this behavior? (if not too big) - I can use that for debugging.


>                              If collectoutput is active now, I
> get an error because I try to concatenate differently long
> arrays. I could easily fix this, sufficiently well for my own
> application, and the fix is quite ugly so probably nothing to
> include in afurther patch.

As to ugliness, well, strread.m by itself is no beauty either. As I'm responsible for most of that (strread.m became some sort of step-child of me), I think I'm entitled to say so :-)
Yet if you stumbled upon a real bug, I think it should be fixed.

Before we get that far, I first need an answer to my questions above.


>                            When written in 1 previously empty
> line to avoid any line number shifts, it looks like this
>
> Line 279
> for ir=1:numel(C);siz(ir)=size(C{ir},1);end;msiz=max(siz');
>for ir=1:numel(C);if siz(ir)<msiz;C{ir}=[C{ir}; 0];end;end;%%

What I'm thinking of:
It may be dependent on the presence of a newline (or rather: EOL) at the very end of the text file. If present, strread.m ensures all output columns are of equal length (too short output columns are padded with empty entries); if absent, all output columns are as long as they are in the original data file.
FYI, this is a Matlab compatibility feature of strread.

Hmmm... Only now I see that this info is lacking in strread's help text.

If it is case 2. above, perhaps textscan.m (and perhaps textread.m as well) should check for presence of an EOL as last character before handing the text string to strread.m, and add it if absent.


Philip



Von: Ben Abbott address@hidden
> On Apr 28, 2012, at 10:59 AM, Philip Nienhuis wrote:
>
>> Ben Abbott wrote:
>>> On Apr 27, 2012, at 12:31 PM, Philip Nienhuis wrote:
>>>
>>>> Philip Nienhuis wrote:
>>>>> <follow-up cc'd to octave-maintainers ML>
>>>>>
>>>>> Graf, Alexander wrote:
>>>>> <snip>
>>>>>> you recently posted a patch to make textscan faster.
>>>> <snip>
>>>>>> However, I think that in line 168 (of the readily patched
>>>>>> file<http://savannah.gnu.org/patch/download.php?file_id=25423>)
>>>>>>
>>>>>> fskipl (fid, varargin{headerlines + 1});
>>>>>>
>>>>>> varargin should be replaced by args. At least that was necessary
>>>>>> to run it with my script, where some parameters unknown to
>>>>>> octave-textscan might have caused empty entries in varargin that
>>>>>> are cleared out in args.
>>>>>
>>>>> Hmmm, I think you are right. Good catch!
>>>>> As my patch has already been pushed, I'll (try to) prepare a changeset
>>>>> this weekend.
>>>>
>>>> Changeset attached.
>>>>
>>>> I also adapted a copyright string in strread.m (was a bit overdue)
>>>>
>>>> Could one of the core devs please check if it is OK and then push it, please?
>>>>
>>>> Thanks,
>>>>
>>>> Philip
>>>> <textscan_strread.patch>
>>>
>>> Minor detail, but is the summary correct ?
>>>
>>>       "Correct assignment to wrong variable f headerlines processing"
>>>
>>> Should the "f" be "for" ?
>>
>> Yes, Ben.
>> I'd rather also attribute it to "Alexander Graf<address@hidden>" as he proposed this fix but I don't know how to get that together. Plain edit? wouldn't that mix up the .hg stuff?
>> (my Mercurial knowledge is close to embarrassing)
>>
>> Could you amend the changeset, please?
>>
>> TIA, Philip
>
> Yes, a simple text editor can be used to change the "User" attribute.
>
> I've pushed the changeset.
>
>          http://hg.savannah.gnu.org/hgweb/octave/rev/e97ec01d4157
>
> Ben
>


reply via email to

[Prev in Thread] Current Thread [Next in Thread]