octave-maintainers

Re: C++ version of regexprep.cc


From: David Bateman
Subject: Re: C++ version of regexprep.cc
Date: Tue, 02 May 2006 16:57:57 +0200
User-agent: Thunderbird 1.5 (Windows/20051201)

Paul Kienzle wrote:

On May 2, 2006, at 7:25 AM, David Bateman wrote:

Paul Kienzle wrote:
David,

octave now goes quickly through the regular expression portion of the code.

I haven't yet confirmed that the results are consistent with matlab.

The next portion involves for loops such as the following:

  tag = cell(number_of_tags,4);
  for i=1:number_of_tags
   tag{i,1} = xml(tag_start(i):tag_end(i))
  end

which for 10000 tags is slow.

Are there octave routines for splitting/joining strings into cells
which are fast?

- Paul

Paul,

Hey, I'm on holidays at the moment, and so have a little time. What about the attached implementation of mat2cell? With this you should be able to replace the above code with

tag = cell(number_of_tags,4);
tag{:,1} = mat2cell (xml, 1, tag_end - tag_start);

mat2cell partitions the matrix into cells. The xml2cell code extracts substrings.
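
As a minimal sketch of what "partitions the matrix into cells" means for a character row vector (the string and the widths here are made-up illustration values, not from the thread): the widths passed to mat2cell have to add up to the length of the string, so the gaps between the pieces of interest need their own entries.

```octave
% Split a 1x8 string into consecutive pieces of width 3, 2 and 3.
s = 'abcdefgh';
widths = [3 2 3];                   % must sum to length(s)
parts = mat2cell (s, 1, widths);    % 1x3 cell: {'abc', 'de', 'fgh'}
```

Because every character has to land in some cell, a list of tag lengths alone would not be a valid partition unless the tags happen to cover the whole string.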

The following does what I expect:

    xml='<eh><bee>   <see> deed </see>  </bee></eh>';
    tag_start = find(xml=='<');
    tag_end = find(xml=='>');
    pieces = [ tag_start; tag_end+1 ];
    partition = diff([1;pieces(:);length(xml)+1]);
    tag_name = mat2cell (xml, 1, partition) (2:2:end);

    tags = cell(length(tag_start),4);
    tags(:,1) = tag_name';
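
To make the partition construction above easier to follow, here is the same sequence of steps on a tiny made-up string, with the intermediate values written out as comments. The leading and trailing spaces are only there so that no partition width is zero; a temporary variable replaces the chained index so the lines also run where indexing a function result directly is not supported.

```octave
xml = ' <a>x</a> ';                  % hypothetical 10-character input
tag_start = find (xml == '<');       % [2 6]
tag_end = find (xml == '>');         % [4 9]
pieces = [tag_start; tag_end+1];     % [2 6; 5 10]: start and one-past-end per tag
% pieces(:) interleaves the columns: [2; 5; 6; 10]
partition = diff ([1; pieces(:); length(xml)+1]);   % [1; 3; 1; 4; 1]
cells = mat2cell (xml, 1, partition);  % {' ', '<a>', 'x', '</a>', ' '}
tag_name = cells(2:2:end);             % every second cell is a tag
```

The odd-numbered cells hold the text between tags, so the same call also yields the element content if that is needed later.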

I just noticed that you didn't say whether this sped up your xml code sufficiently, or whether there is another speed problem elsewhere.

D.
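
One way to answer the speed question would be to time both versions on the same input. The following is only a sketch under stated assumptions: the input is synthetic (a small fragment repeated 10000 times, which is not the real xml data), and the variable names tag_loop, tag_vec, t_loop and t_vec are invented for the comparison. No timings are quoted, since they depend on the machine and the Octave version.

```octave
% Synthetic input: 10000 copies of a small fragment (20000 tags in total).
xml = repmat (' <a> x </a> ', 1, 10000);
tag_start = find (xml == '<');
tag_end = find (xml == '>');
n = length (tag_start);

% Loop version, as in the original question. The trailing semicolon
% suppresses printing of each assignment.
tic;
tag_loop = cell (n, 1);
for i = 1:n
  tag_loop{i} = xml(tag_start(i):tag_end(i));
end
t_loop = toc;

% mat2cell version, following the partition approach from the thread.
tic;
pieces = [tag_start; tag_end+1];
partition = diff ([1; pieces(:); length(xml)+1]);
cells = mat2cell (xml, 1, partition);
tag_vec = cells(2:2:end);
t_vec = toc;

printf ('loop: %g s, mat2cell: %g s\n', t_loop, t_vec);

% Both approaches should extract exactly the same tags.
assert (isequal (tag_loop', tag_vec));
```

If the two times turn out similar, the remaining cost is probably elsewhere in the xml code rather than in the substring extraction.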

