[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: Built-in base2dec and dec2base
From: |
Daniel J Sebald |
Subject: |
Re: Built-in base2dec and dec2base |
Date: |
Sun, 19 Aug 2012 20:58:16 -0500 |
User-agent: |
Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.24) Gecko/20111108 Fedora/3.1.16-1.fc14 Thunderbird/3.1.16 |
Rik,
I've gone over the builtin base2dec/dec2base code and created a second
version of the patch file.
https://savannah.gnu.org/patch/index.php?7825
Changes are:
1) Added checks for non-negative and non-fraction.
2) Vary the width of division (via template) based on max size to bump
up efficiency.
3) Modifications to ensure unsigned long numbers are done properly.
I think everything is covered now, and I've verified all the tests and a
little more.
What I propose is that someone with more advanced understanding of the
Octave classes suggesting ways to program this slightly better by
utilizing code reuse or other member functions. John or you could also
make some mods and create a new patch on Savannah. (I'd appreciate that
actually as it would help understand the Octave classes better and the
quirks such as having to work with "uint64" as opposed to "unsigned
long" in order to get proper results.)
I'm including some numbers below because the Savannah site tosses out
extra spaces so columns don't come through.
Dan
PERFORMANCE
-----------
I've redone some performance tests and found about 10% decrease in times
of for the built-in function due to the added checks described in (1)
above. Here are some performance results using a 10e6 size vector
(strings or random numbers, depending upon base2dec vs. dec2base). The
specs are seconds of CPU consumption.
COMMAND BUILT-IN CURRENT
VERSION SCRIPT
VERSION
_______ ________ _______
bin2dec(<char matrix>) 0.15298 0.56591
bin2dec(<cell vector>) 0.24196 1.7647
hex2dec(<char matrix>) 0.15298 0.55892
hex2dec(<cell vector>) 0.23996 1.7747
base2dec(<char mat>, '01') 0.18997 0.53792
base2dec(<cell vec>, '01') 0.27496 1.7217
dec2bin(<int vector>) 0.25796 1.1488
dec2bin(<cell vector>) 0.26296 3.6314
dec2hex(<int vector>) 0.14098 0.30096
dec2hex(<cell vector>) 0.16097 2.7606
dec2base(<int vec>, '01') 0.25696 1.1498
dec2base(<cell vec>, '01') 0.25896 3.5345
Furthermore, here are some related times:
cellstr(<char matrix>) 0.82187
num2cell(<int vector>) 0.064990
DIVISION SIZE
-------------
I've also done some tests to check if changing the division size is
worth it. For example, to investigate 8 bits, I checked 2^8-1, 2^8 and
2^9. The idea is to try and get a feel for what portion of CPU
consumption increase is due to the one character longer number and what
is due to the wider division.
Judging from the numbers, it seems clear that there is some increase
attributable to the wider division noticeable for the transition from a
32-bit divide to a 64-bit divide.
Here are some performance results using a 10e6 size vector, but non
random and at the border of transition from 8, 16, 32, 64 bit. The
specs are seconds of CPU consumption.
COMMAND BUILT-IN CURRENT
VERSION SCRIPT
VERSION
_______ ________ _______
dec2bin(2^8-1) 0.17397 0.68489
dec2bin(2^8) 0.19097 0.74089
dec2bin(2^9) 0.20797 0.88487
dec2bin(2^16-1) 0.25996 1.3968
dec2bin(2^16) 0.26996 1.3738
dec2bin(2^17) 0.28396 1.4608
dec2bin(2^32-1) 0.45993 2.6836
dec2bin(2^32) 0.58791 2.6416
dec2bin(2^33) 0.63190 2.6996
dec2bin(2^64-1) 1.1818 5.2662
dec2bin(2^64) 1.1788 5.2262
dec2bin(2^65) 1.1738 5.2612
dec2bin({2^8-1}) 0.17297 3.2095
dec2bin({2^8}) 0.18497 3.1865
dec2bin({2^9}) 0.20297 3.2615
dec2bin({2^16-1}) 0.26596 3.8494
dec2bin({2^16}) 0.28996 3.7954
dec2bin({2^17}) 0.28696 3.8954
dec2bin({2^32-1}) 0.45493 5.1152
dec2bin({2^32}) 0.59191 5.0582
dec2bin({2^33}) 0.63190 5.3472
dec2bin({2^64-1}) 1.1768 7.7158
dec2bin({2^64}) 1.1788 7.6908
dec2bin({2^65}) 1.1848 7.6748
On 07/29/2012 11:30 PM, Daniel J Sebald wrote:
I've done a first pass of built-in versions of base2dec() and
dec2base(), and I put a patch in the features portion of SourceForge
Carnë informed me I used the wrong tracker. (Thanks Carnë.) I moved the
patch to:
https://savannah.gnu.org/patch/index.php?7825
...
I tested a version of dec2base that uses the C++ library div() routine
http://www.cplusplus.com/reference/clibrary/cstdlib/div/
to generate quotient and remainder at the same time. Where I thought it
might cut the computations some, it turns out to be slower than using
operate divide followed by multiply and subtract. My guess is that the
div() routine for some reason doesn't know how to optimize for the
processor or the CPU doesn't have remainder as part of its div
instruction. Consequently it just introduces extra overhead of the div_t
structure.
COMMAND BUILT-IN BUILT-IN
DIVIDE div_t
MULTIPLY div()
_______ ________ _______
dec2bin(<int vector>) 0.22697 0.25996
dec2bin(<cell vector>) 0.23996 0.28496
dec2hex(<int vector>) 0.11998 0.12998
dec2hex(<cell vector>) 0.13898 0.16198
dec2base(<int vec>, '01') 0.22497 0.24796
dec2base(<cell vec>, '01') 0.23996 0.28396
Dan
--
Dan Sebald
email: daniel(DOT)sebald(AT)ieee(DOT)org
URL: http://www(DOT)dansebald(DOT)com
- Re: Built-in base2dec and dec2base,
Daniel J Sebald <=