On May 2, 2006, at 7:25 AM, David Bateman wrote:
Paul Kienzle wrote:
David,
Octave now goes quickly through the regular expression portion of
the code.
I haven't yet confirmed that the results are consistent with MATLAB.
The next portion involves for loops such as the following:
    tag = cell(number_of_tags,4);
    for i=1:number_of_tags
      tag{i,1} = xml(tag_start(i):tag_end(i));
    end
which for 10000 tags is slow.
Are there octave routines for splitting/joining strings into cells
which are fast?
- Paul
Paul,
Hey, I'm on holidays at the moment, and so have a little time. What
about the attached implementation of mat2cell? With this you should
be able to replace the above code with
    tag = cell(number_of_tags,4);
    tag{:,1} = mat2cell (xml, 1, tag_end - tag_start);
mat2cell partitions the matrix into cells. The xml2cell code extracts
substrings.
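For anyone following along, here is a minimal sketch of what mat2cell does to a character row. The string and the widths are made up for illustration; the only real constraint is that the widths must sum to the number of columns.

```octave
% Hypothetical input, chosen only to show the partitioning.
s = 'abcdefghij';
widths = [3 2 5];                  % must sum to columns (s), here 10

% Split the 1x10 char row into a 1x3 cell array of substrings.
parts = mat2cell (s, 1, widths);

% parts{1} is 'abc', parts{2} is 'de', parts{3} is 'fghij'
disp (parts);
```

Because the widths have to cover the whole string, extracting only selected substrings means partitioning everything and then discarding the pieces in between.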
The following does what I expect:
    xml='<eh><bee> <see> deed </see> </bee></eh>';
    tag_start = find(xml=='<');
    tag_end = find(xml=='>');
    pieces = [ tag_start; tag_end+1 ];
    partition = diff([1;pieces(:);length(xml)+1]);
    tag_name = mat2cell (xml, 1, partition) (2:2:end);
    tags = cell(length(tag_start),4);
    tags(:,1) = tag_name';
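As a sanity check, the mat2cell result can be compared against the original loop on the same small string. This is only a sketch: the names all_parts and expected are mine, and the indexing is split into two steps so that it does not rely on Octave's chained indexing.

```octave
xml = '<eh><bee> <see> deed </see> </bee></eh>';
tag_start = find (xml == '<');
tag_end   = find (xml == '>');

% Interleave the boundaries: start1, end1+1, start2, end2+1, ...
pieces    = [tag_start; tag_end+1];
partition = diff ([1, pieces(:).', length(xml)+1]);

% Even-numbered pieces are the tags, odd-numbered ones the text between.
all_parts = mat2cell (xml, 1, partition);
tag_name  = all_parts(2:2:end);

% Reference result from the straightforward loop.
expected = cell (1, numel (tag_start));
for i = 1:numel (tag_start)
  expected{i} = xml(tag_start(i):tag_end(i));
end

assert (isequal (tag_name, expected));
```

For this input both approaches give the six tags <eh>, <bee>, <see>, </see>, </bee> and </eh>. The first partition width is zero because the string starts with a tag, so the first odd-numbered piece is simply empty.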