Re: bin2dec behavior different from Matlab?
From: Daniel J Sebald
Subject: Re: bin2dec behavior different from Matlab?
Date: Fri, 16 Mar 2012 22:15:26 -0500
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.24) Gecko/20111108 Fedora/3.1.16-1.fc14 Thunderbird/3.1.16
On 03/16/2012 07:38 PM, Rik wrote:
On 03/16/2012 04:53 PM, Daniel J Sebald wrote:
One can do this. In general, cellstr operations are slower than indexing on
character arrays. I tried the following and it works:
s = char (strrep (cellstr (s), " ", ""));
s = strjust (s, "right");
Why is strjust necessary here?
If the white space is removed from the string the string will already be
justified as a consequence. Remove the strjust() command and benchmark
again.
The algorithm depends on the character matrix being right justified. The
char function produces a left-justified matrix. Try 'char ("1", "111")' as
an example.
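A minimal sketch of the justification point, using only char() and strjust(); the variable names are just for illustration:

```octave
% char() pads the shorter row on the right, so the matrix is left-justified.
m = char ("1", "111");        % rows: "1  " and "111"

% strjust() moves the padding to the left, which is the layout the
% positional-weight algorithm expects.
r = strjust (m, "right");     % rows: "  1" and "111"

disp (m);
disp (r);
```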
OK, we aren't thinking along the same lines. What I'm wondering is if
there is some method of doing the bin2dec group of functions without the
character matrix approach. With the advent of the cell array, the group
of routines that worked with character strings in a matrix configuration
sort of fell out of favor. So now people programming scripts might
think in terms of a cell array of character strings of binary numbers.
That data might come from a file or whatever; it's just that it is more
convenient to work with strings contained in a cell array.
Also, strrep may not be so efficient because it is general: it works
with two arbitrary strings. This process is only interested in the single
character ' ', so an isspace or != test might prove much faster.
You can use indexing for deletion within ordinary arrays but not for cell
strings. Try 'cstr = {"1 0 1"; "1"}; cstr(isspace(cstr)) = ""' and it
will simply error out. regexprep() would work but it is slower than strrep.
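As a rough illustration of the options being compared (a sketch, not a benchmark): indexed deletion on a single character string, and strrep() or regexprep() applied to a cell array of strings. The variable names are made up for the example.

```octave
% Indexed deletion works on an ordinary character string.
s = "1 0 1";
s(isspace (s)) = [];                 % s is now "101"

% For a cell array of strings, strrep() and regexprep() both accept the
% cell directly and return a cell of the same shape.
cstr = {"1 0 1"; "1"};
c1 = strrep (cstr, " ", "");
c2 = regexprep (cstr, " ", "");
```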
Also, there may be a technique of using cellfun instead of converting
back to char that can save time.
I've benchmarked cellfun many times and it is slower than straight indexing.
There are a lot of optimization methods to explore here.
Feel free to improve the code. It is available in Mercurial. The
changeset is 14472:e995b1c97e13.
To create a test matrix I used
tvec = char (randi ([48 49], 1e6, 10));
tvec(randi(1e7, 1e6,1)) = " ";
which creates 1 million 10 digit binary numbers with about 10% of the
values being spaces.
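A scaled-down sketch of the same construction, to check the "about 10%" figure. The size n is an assumption chosen to keep it quick. Because randi() can pick the same linear index more than once, the fraction of spaces comes out slightly under 10%.

```octave
n = 1e4;                                  % smaller than the 1e6 used above
tvec = char (randi ([48 49], n, 10));     % n rows of ten '0'/'1' characters
tvec(randi (10*n, n, 1)) = " ";           % overwrite n random positions
frac = mean (tvec(:) == " ");             % fraction of characters that are spaces
printf ("fraction of spaces: %.4f\n", frac);
```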
The input you are choosing is a character matrix. Let's also create the
equivalent cell array of character strings:
ctvec = cellstr(tvec);
I'm saying that ctvec is more likely to be the user's starting point
these days, and that to convert that ctvec to a character array might
not be the way to go.
Unfortunately, I don't have the latest Octave and until I can get
Mercurial working on my machine I can't do benchmark comparisons.
From my rough estimate, you have a fast machine. I have a Xeon quad
core running at 3GHz and I'm not getting near the times you are with
bin2dec. Perhaps there has been a lot of optimization in Octave over
the past couple of versions.
Well, here is what I'm doing for a simple test, and you can experiment
with this little bit of code on your machine. I'm attaching a script
file called test_bin2dec.m which uses cellfun() to implement bin2dec.
It is bare bones and doesn't do any sanity check that the characters are
'0' or '1', but my point is to illustrate there is a different
approach.
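The attached file is not reproduced in this archive. The following is only a hypothetical sketch of what a cellfun()-based conversion could look like, under the same assumption of no input validation. The names bits2num and bin2dec_cell are invented for the illustration and are not taken from test_bin2dec.m.

```octave
% Hypothetical helper: drop whitespace, turn '0'/'1' into 0/1, and weight
% each bit by a power of two (most significant bit first).
bits2num = @(s) sum ((s(! isspace (s)) - "0") ...
                     .* 2 .^ (nnz (! isspace (s)) - 1 : -1 : 0));

% Apply the helper to every string in the cell array.
bin2dec_cell = @(c) cellfun (bits2num, c);

d = bin2dec_cell ({"1 0 1"; "1"; "1111"});
disp (d);
```

Embedded spaces are deleted before conversion, so "1 0 1" is read as "101".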
Now, if cellfun() turns out to be slightly slower, the cleaner code might
still give it an advantage. Or maybe we'll have to ask John to look at the
internal cellfun() routine because the point of that routine is looping
efficiency.
Running this little script
tvec = char (randi ([48 49], 1e6, 10));
tvec(randi(1e7, 1e6,1)) = " ";
ctvec = cellstr(tvec);
cpuzero = cputime();
junk = bin2dec(tvec);
cputime() - cpuzero
cpuzero = cputime();
junk = test_bin2dec(ctvec);
cputime() - cpuzero
on my machine produces
octave:19> tvec = char (randi ([48 49], 1e6, 10));
octave:20> tvec(randi(1e7, 1e6,1)) = " ";
octave:21> ctvec = cellstr(tvec);
octave:22>
octave:22> cpuzero = cputime();
octave:23> junk = bin2dec(tvec);
octave:24> cputime() - cpuzero
ans = 423.04
octave:25>
octave:25> cpuzero = cputime();
octave:26> junk = test_bin2dec(ctvec);
octave:27> cputime() - cpuzero
ans = 68.398
So two questions come to mind from this:
1) The cellfun() based approach is roughly six times faster (68.4 s
versus 423.0 s) than the version 3.2.4 bin2dec (granted, I left out
several things), and the conversion ctvec = cellstr(tvec) is relatively
fast compared to these benchmark times, so maybe converting to a cell
array approach is better. (John's point, I believe.) There may be a
better approach than the power() routine, but I was just trying to
illustrate cellfun().
2) What machine are you using that is so fast?!
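On point 1, one possible alternative to calling power() once per string is to compute the bit weights once and use a matrix product. This is only a sketch, and it assumes the strings contain no embedded spaces, since a space is treated here as a zero bit rather than deleted.

```octave
c = {"101"; "1"; "111"};                 % example input, no embedded spaces
m = strjust (char (c), "right");         % pad on the left to a common width
bits = double (m == "1");                % '1' -> 1; '0' and padding -> 0
w = 2 .^ (columns (m) - 1 : -1 : 0).';   % weights, most significant bit first
d = bits * w;                            % one matrix product for all rows
disp (d);
```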
Dan
Attachment: test_bin2dec.m
Description: Text Data