
[Octave-bug-tracker] [bug #65244] Simplify code complexity of perms.cc


From: Hendrik K
Subject: [Octave-bug-tracker] [bug #65244] Simplify code complexity of perms.cc
Date: Sat, 10 Feb 2024 00:23:48 -0500 (EST)

Follow-up Comment #18, bug#65244 (group octave):

The discussion is interesting. To shed some light on the practical
implications, one needs to keep in mind that perms deals with factorials, that
factorials grow faster than anybody usually imagines, and that asymptotic time
complexity is only an indication of actual CPU execution time. In a nutshell:

a) The code simplification using mutual comparisons is in nearly all cases
faster(!) than the (previous) sort approach.
b) Removing constexpr and using the C++11-compatible constexpr solution shown
by Markus are both fine.
c) For the "factorial" function in perms, either double or long double works.


Personally (surprise, surprise - this reflects my past coding decisions...), I
prefer keeping constexpr via Markus's solution and using long double for the
factorial function used by perms, but I am fine either way: in terms of
practical implications, the alternatives for b) and c) make no difference...
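
For reference, a C++11-compatible constexpr factorial along those lines is
essentially a one-liner; the sketch below only illustrates the shape such a
solution can take (not necessarily Markus's exact patch), using long double:

  // Sketch only: a C++11-compatible constexpr factorial (a single return
  // statement with recursion, which C++11 allows), returning long double.
  // Not necessarily the exact code proposed in this report.
  constexpr long double
  factorial (int n)
  {
    return (n <= 1) ? 1.0L : static_cast<long double> (n) * factorial (n - 1);
  }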



More detailed analysis
======================

While sorting has a lower time complexity (typically n*log(n); the worst case
is n*n, by the way), one still needs to post-process the sorted result to
determine the number of unique values and how often each occurs, which
increases both code complexity and time complexity. So in reality the
complexity is n*log(n) + (n-1).
Also, we do not need the sorted (interim) result itself, so sorting does more
than required: the overhead of reserving memory, filling it with values,
rearranging them etc. is "wasted".
 
 
Mutual comparison always requires n*(n-1)/2 comparisons and carries no
equivalent (unnecessary) overhead.
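
To illustrate the idea (a sketch only, not the actual perms.cc code):
counting how often each element occurs by mutual comparison needs no sorting
and no separate post-processing pass:

  #include <cstddef>
  #include <vector>

  // Sketch only (not the actual perms.cc implementation): determine how
  // often each element occurs using n*(n-1)/2 mutual comparisons,
  // without sorting.
  std::vector<std::size_t>
  count_multiplicities (const std::vector<double>& v)
  {
    const std::size_t n = v.size ();
    std::vector<std::size_t> cnt (n, 1);   // every element counts itself once

    for (std::size_t i = 0; i < n; i++)
      for (std::size_t j = i + 1; j < n; j++)
        if (v[i] == v[j])
          {
            cnt[i]++;   // v[j] is a duplicate of v[i] ...
            cnt[j]++;   // ... and vice versa
          }

    return cnt;
  }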

for i = 1:20
  printf ("n = %2d, Quicksort approach complexity %8.1f versus mutual comparison approach complexity %6d\n", i, i*log(i) + i - 1, i*(i-1)/2);
endfor
n =  1, Quicksort approach complexity      0.0 versus mutual comparison approach complexity      0
n =  2, Quicksort approach complexity      2.4 versus mutual comparison approach complexity      1
n =  3, Quicksort approach complexity      5.3 versus mutual comparison approach complexity      3
n =  4, Quicksort approach complexity      8.5 versus mutual comparison approach complexity      6
n =  5, Quicksort approach complexity     12.0 versus mutual comparison approach complexity     10
n =  6, Quicksort approach complexity     15.8 versus mutual comparison approach complexity     15

n =  7, Quicksort approach complexity     19.6 versus mutual comparison approach complexity     21
n =  8, Quicksort approach complexity     23.6 versus mutual comparison approach complexity     28
n =  9, Quicksort approach complexity     27.8 versus mutual comparison approach complexity     36
n = 10, Quicksort approach complexity     32.0 versus mutual comparison approach complexity     45
n = 11, Quicksort approach complexity     36.4 versus mutual comparison approach complexity     55
n = 12, Quicksort approach complexity     40.8 versus mutual comparison approach complexity     66
n = 13, Quicksort approach complexity     45.3 versus mutual comparison approach complexity     78
n = 14, Quicksort approach complexity     49.9 versus mutual comparison approach complexity     91
n = 15, Quicksort approach complexity     54.6 versus mutual comparison approach complexity    105
n = 16, Quicksort approach complexity     59.4 versus mutual comparison approach complexity    120
n = 17, Quicksort approach complexity     64.2 versus mutual comparison approach complexity    136
n = 18, Quicksort approach complexity     69.0 versus mutual comparison approach complexity    153
n = 19, Quicksort approach complexity     73.9 versus mutual comparison approach complexity    171
n = 20, Quicksort approach complexity     78.9 versus mutual comparison approach complexity    190


Up to n = 6 the mutual comparison time complexity is actually lower than that
of quicksort, so mutual comparison will be faster there.

I took n = 7 and measured the processing time (using
std::chrono::high_resolution_clock inside the mutual-comparison code of
GetPerms):

n = 7; a = perms(floor(rand(1,n)*n),"unique");
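
The measurement pattern was roughly the following (a standalone sketch of the
std::chrono instrumentation; the real time points were placed inside GetPerms,
which is not reproduced here):

  // Standalone sketch of the timing pattern: wrap the code section of
  // interest with std::chrono::high_resolution_clock time points.
  // The loop below is only a placeholder for the measured section.
  #include <chrono>
  #include <iostream>

  int
  main ()
  {
    auto t0 = std::chrono::high_resolution_clock::now ();

    volatile double dummy = 0;          // placeholder workload
    for (int i = 0; i < 1000000; i++)
      dummy = dummy + i;

    auto t1 = std::chrono::high_resolution_clock::now ();

    std::cerr << "elapsed: "
              << std::chrono::duration_cast<std::chrono::microseconds> (t1 - t0).count ()
              << " microseconds\n";

    return 0;
  }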

The proportion of CPU time spent on determining the unique elements and their
frequencies was typically less than 1% for n = 7. Factorials grow materially
faster than n*log(n), so a larger n means an even smaller proportion spent on
this task.

Then I measured GetPerms using the sort approach. The overhead of the (unused)
sorted result is quite material, so sorting actually took MORE time than the
mutual comparisons for n = 7, 8, 9, 10.

Note that the overhead of perms itself (checking and extracting the parameters
etc. before calling GetPerms) was NOT included; it is proportionally material
for small n.

Conclusions:
In practice mutual comparison is (nearly always) faster, and the time spent in
the code path that handles "unique" is only a tiny proportion of the total
execution time, so it does not matter whether constexpr is used or not.




perms produces a matrix of size (n!, n). The minimal memory requirements
(assuming int8 elements) are:

s = {  "bytes", "kilobytes", "megabytes", "gigabytes", "terabytes",
"pentabytes", "exabytes", "zettabytes" };
for i=1:20
  N = factorial(i) * i;
  s_i = 1;
  r_i = N;
  st = "";
  if N > flintmax(0.0)
    st = [ st " flintmax reached! "];
  endif
  if N >= intmax(uint64(0));
    st = [ st " 64 bit limit reached! "];
  endif
  while (r_i >= 1024)
    s_i++;
    r_i ./= 1024;
  endwhile
  printf("n = %2d, Required memory size: ~ %7.1f %10s %s\n", i, r_i,
s{s_i},st);
endfor

n =  1, Required memory size: ~     1.0      bytes
n =  2, Required memory size: ~     4.0      bytes
n =  3, Required memory size: ~    18.0      bytes
n =  4, Required memory size: ~    96.0      bytes
n =  5, Required memory size: ~   600.0      bytes
n =  6, Required memory size: ~     4.2  kilobytes
n =  7, Required memory size: ~    34.5  kilobytes
n =  8, Required memory size: ~   315.0  kilobytes
n =  9, Required memory size: ~     3.1  megabytes
n = 10, Required memory size: ~    34.6  megabytes
n = 11, Required memory size: ~   418.7  megabytes
n = 12, Required memory size: ~     5.4  gigabytes
n = 13, Required memory size: ~    75.4  gigabytes
n = 14, Required memory size: ~     1.1  terabytes
n = 15, Required memory size: ~    17.8  terabytes
n = 16, Required memory size: ~   304.5  terabytes
n = 17, Required memory size: ~     5.4  petabytes
n = 18, Required memory size: ~   102.4  petabytes  flintmax reached!
n = 19, Required memory size: ~     2.0   exabytes  flintmax reached!
n = 20, Required memory size: ~    42.2   exabytes  flintmax reached!  64 bit limit reached!



Conclusions:
There is no practical difference between using double and long double for the
factorial function, unless one happens to have a computer with 100+ petabytes
of memory and a near eternity of CPU time to create and fill that much
memory...
Either way is fine.



    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?65244>

_______________________________________________
Message sent via Savannah
https://savannah.gnu.org/



