octave-maintainers
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Functions calls got slower between Octave 2.9.5 and 2.9.6 through .10


From: Luis F. Ortiz
Subject: Functions calls got slower between Octave 2.9.5 and 2.9.6 through .10
Date: Sun, 22 Apr 2007 21:13:22 -0400

First, let me make a confession: I'm a little obsessed by performance.
I don't mind reading assembly language and get a kick out of being able to make 
a tweak that makes things faster on 3 architectures and 2 different compilers.

But now I've run into something that has got me stumped.  it started with a complaint
by an engineer that Octave 2.9.9 was about 15% slower at solving a optimization problem 
using an octave-forge package than Octave 2.9.5.  After working with it a bit, I found other applications
that had slowed down as well.  I began to suspect that the cost of doing  a function call
had increased, so I started benchmarking with a simple little function that I use to test function 
call performance on various languages, the Ackermann function.  Here is the source code:

function uck = quack(m,n)
% global calls
% calls = calls + 1 ;
 if(m == 0 )
  uck = n + 1 ;
  return;
 end
 if(n == 0 )
  uck = quack(m-1,1);
  return;
 end
 uck = quack(m-1,quack(m,n-1));
return;

Now this little devil makes lots of function calls.   If you un-comment the 'global calls counter', for quack(3,6),
you will find a total of 172,233 function calls are made to compute the number 509.  Of course, you will need
to set your max_recursion_depth to 10+(2^(n+3)) because those function calls get nested verrrrry deeply.

So then I ran it under a few different versions of Octave to see how well they performed:


Octave VersionACK(3,6) X 2 Time
56.911
68.149
78.388
88.728
98.309
108.569

As you can see, something went awry between Octave 2.9.5 and 2.9.6, and has stayed bad since, to
the tune of 20% less goodness.

My next impulse was to try compiling octave 2.9.6 using the '-pg' switch and build
a profiling version of octave 2.9.6 and see what were the top 50 things octave spent its
timing doing while running this program:


% timecumulative secondsself secondscallsCalls Per Ack Name
5.262.052.053575071.038 unwind_protect::run_frame(std::string const&)
4.213.691.643444641.000 symbol_table::push_context()
4.175.321.633453371.003 symbol_table::clear()
4.056.91.583444641.000 symbol_table::pop_context()
3.728.351.45572736316.627 unwind_protect::add(void (*)(void*), void*)
3.499.711.363453291.003 charMatrix::charMatrix(string_vector const&)
3.110.921.2119167655.564 tree_identifier::rvalue(int)
3.0312.11.1824112487.000 symbol_record::push_context()
313.271.1724112487.000 symbol_record::pop_context()
2.9214.411.1417439215.063 symbol_table::lookup(std::string const&, bool, bool)
2.815.51.09593124217.219 std::vector<octave_value, std::allocator<octave_value> >::operator=(std::vector<octave_value, std::allocator<octave_value> > const&)
2.4516.450.9620758336.026 symbol_record::variable_reference()
2.117.270.8224261927.043 symbol_record::clear()
2.0418.070.835985058104.466 octave_value::~octave_value()
1.818.770.73453371.003 octave_user_function::do_multi_index_op(int, octave_value_list const&)
1.6819.420.66558227616.206 octave_value_list::operator=(octave_value_list const&)
1.5520.030.611017234429.531 Array<std::string>::~Array()
1.4920.610.5810389273.016 tree_binary_expression::rvalue()
1.4521.170.571186128234.434 octave_value::octave_value()
1.3821.710.543528681.024 tree_index_expression::rvalue(int)
1.3322.230.52608079217.653 octave_allocator::alloc(unsigned long)
1.3122.740.513451611.002 tree_parameter_list::define_from_arg_vector(octave_value_list const&)
1.2823.240.5364715610.588 saved_variable::restore(void*)
1.1523.690.4519167655.564 tree_identifier::rvalue()
1.1324.130.443528611.024 tree_argument_list::convert_to_const_vector(octave_value const*)
1.124.560.433489661.013 tree_simple_assignment::rvalue()
1.0524.970.4112149293.527 tree_statement::eval(bool, int, bool)
1.0325.370.43525171.023 tree_index_expression::rvalue()
0.9525.740.3712149313.527 unwind_protect::run()
0.9426.110.373453311.003 octave_user_function::subsref(std::string const&, std::list<octave_value_list, std::allocator<octave_value_list> > const&, int)
0.926.460.355211251.513 tree_statement_list::eval(bool, int)
0.8726.80.341185004234.401 octave_base_value::~octave_base_value()
0.8227.120.3220758336.026 tree_identifier::lvalue()
0.7427.70.296906572.005 tree_parameter_list::undefine()
0.7427.410.29 0.000 octave_base_value::clone() const
0.7227.980.28511516514.850 octave_value* std::__uninitialized_copy_aux<__gnu_cxx::__normal_iterator<octave_value const*, std::vector<octave_value, std::allocator<octave_value> > >, octave_value*>(__gnu_cxx::__normal_iterator<octave_value const*, std::vector<octave_value, std::allocator<octave_value> > >, __gnu_cxx::__normal_iterator<octave_value const*, std::vector<octave_value, std::allocator<octave_value> > >, octave_value*, __false_type)
0.6428.480.2531200639.058 octave_value::operator=(octave_value const&)
0.6428.230.25864063425.084 octave_base_scalar<double>::is_defined() const
0.5828.70.2324288137.051 std::_List_base<octave_value_list, std::allocator<octave_value_list> >::__clear()
0.5529.130.2219170095.565 lookup(symbol_record*, bool)
0.5529.350.2210454533.035 Array<octave_value (*)(octave_base_value const&, octave_base_value const&)>::checkelem(int, int, int)
0.5528.920.22643197118.672 octave_call_stack::instance_ok()
0.5429.770.21366671110.645 octave_value_list::elem(int)
0.5429.980.213489661.013 tree_simple_assignment::rvalue(int)
0.5429.560.21 0.000 octave_base_scalar<double>::nnz() const
0.5130.180.220750046.024 octave_lvalue::assign(octave_value::assign_op, octave_value const&)
0.5130.380.25208041.512 tree_if_command_list::eval()
0.4730.940.193453371.003 octave_user_function::restore_args_passed()
0.4731.130.193453371.003 octave_user_function::install_automatic_vars()
0.4930.570.1929661998.611 octave_value* std::__uninitialized_fill_n_aux<octave_value*, unsigned long, octave_value>(octave_value*, unsigned long, octave_value const&, __false_type)



The "Calls Per Ack" column is the number of "Calls" divided by 172,233*2.   It gives you hint about how many
times a function is executed as a result of each call to quack().  Some  numbers are scary-big, like the
number of times the octave_value destructor is called (104 Calls per ACK) or the Array<std::string> destructor
(29+ times per ACK).   I don't at all understand why so many octave_value and string array objects are being
created for so simple a program, they are not directly big chunks of time. Most of the time is pretty spread out 
among the classes unwind_protect::, symbol_table::, and tree_*::.    

I tried taking a look at the diff set between 2.9.5 and 2.9.6, but was not able to see anything that leapt out at me
as an obvious problem.

I then generated profiling octaves for all versions from 2.9.5 to 2.9.10 and ran the same application through.
Though the profiles differer in the order of what's on the top, the same cast of characters is present.

It is almost as if the cost of doing all of these varied operations just increased, not one method or one class.

Does anyone have any hints as to what might be going on or what can be done or tried?

--Luis Ortiz


 






reply via email to

[Prev in Thread] Current Thread [Next in Thread]