On Apr 2, 2013, at 11:42 PM, John W. Eaton wrote:
On 04/02/2013 07:40 PM, Ben Abbott wrote:
On Apr 2, 2013, at 6:35 PM, Carnë Draug wrote:
On 2 April 2013 18:04, Ben Abbott<address@hidden> wrote:
On Apr 2, 2013, at 12:00 PM, Carnë Draug wrote:
On 2 April 2013 13:02, Ben Abbott<address@hidden> wrote:
I've added a "conventional" to the possible delimitertypes, and selected this
as the default when all delimiters are scalar characters. I've also enabled 2D character
array inputs. The following replacements may be made into OF, and Octave core, to use
Jaroslav's implementation. A changeset to core to use these substitutions will also be
needed.
- strsplit (str, del)
+ strsplit (str, del, "collapsedelimiters", false, "delimitertype",
"conventional")
- strsplit (str, del, false)
+ strsplit (str, del, "collapsedelimiters", false, "delimitertype",
"conventional")
- strsplit (str, del, true)
+ strsplit (str, del, "collapsedelimiters", true, "delimitertype",
"conventional")
That means that OF packages with this change, will only work after 3.8
being released, and the ones without the change will stop working once
3.8 is released, so the releases would have to be timed.
Instead, I propose we add the old strsplit in the private directory of
each package. Then, once 3.8 gets released, we can remove them and
make the change as each package gets re-released.
That's a good idea!
I have just made the change on revision 11779. I have also created a
new task on the trackers https://savannah.gnu.org/task/index.php?12561
so we don't forget what packages got affected.
Finally, geometry and mechanics package need to be fixed some other
way since they make use of it on PKG_DEL and PKD_ADD scripts.
Carnë
I pushed a slightly modified changeset.
http://hg.savannah.gnu.org/hgweb/octave/rev/5be43435bd5b
In a recent changeset, you made the following change:
- list = strsplit (list, " \n\t", true);
+ list = strsplit (list, " \n\t", true, "delimitertype", "legacy");
If people will have to make a change in their code to recover previous behavior
does is really help to introduce a new option like this? Isn't this equivalent
to
list = strsplit (list, {" ", "\n", "\t"})
? Why not just tell people to make this change instead of introducing the
"legacy" delimiter type?
The "legacy" delimiter type was introduced because Jaroslav's implementation
was roughly 30x faster than the regexp approach. In many instances the slow down
probably isn't a big deal, but for strread / textscan the slow down would be a heavy
burden.
Also, the new strsplit does not seem to be working correctly for single-quoted
strings that contain escape sequences. I think that
strsplit ("foo\tbar", '\t')
should split on the TAB, but it is currently returning the original string for
me.
Opps. I hadn't considered single quoted strings. Your example does work in
Matlab. How is this sort of thing handled in other places? It is sufficient
to just ...
delimiter = sprintf (delimiter) ?
This is starting to look a bit messy. Maybe we should revert this change until
we have a better plan for the transition. Maybe it would be good to give
people some advance warning before making this change all at once.
Sorry, I hadn't considered that. I say Jordi's request for someone to take
this on, and it looked like a short project I could find the time for.
https://mailman.cae.wisc.edu/pipermail/octave-maintainers/2013-March/032734.html
To revert, the following three changesets need to be included.
http://hg.savannah.gnu.org/hgweb/octave/rev/1de4ec2a856d
http://hg.savannah.gnu.org/hgweb/octave/rev/5be43435bd5b
http://hg.savannah.gnu.org/hgweb/octave/rev/61989cde13ae
Ben