octave-maintainers
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Five other functions that are in Matlab core ported to Octave


From: David Bateman
Subject: Re: Five other functions that are in Matlab core ported to Octave
Date: Thu, 06 Sep 2007 23:04:33 +0200
User-agent: Thunderbird 1.5.0.7 (X11/20060921)

John W. Eaton wrote:
>
> I agree that there is probably a lot of room for improvementhere, but
> any change to subsasgn should wait until the object branch is merged
> (after 3.0).

Well, if there is some minimal changes we might do that speed things up
 we should probably try. I compiled a version of octave for use with
gprof and ran


m = 1000; n = 1000;
col = ones (m, 1);
a = zeros(m, n);
for i = 1: n;
  a(:,i) = col;
end
quit

and then did a flat profile on it. The top consuming functions are


Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 25.00      0.20     0.20  1000000     0.00     0.00
Array<double>::elem(int, i
nt)
 21.25      0.37     0.17     4958     0.03     0.03
file_ops::tilde_expand(std
::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)
 13.75      0.48     0.11     1000     0.11     0.31  int
assignN<double, double
>(Array<double>&, Array<double> const&, double const&)
 13.75      0.59     0.11        2    55.00    55.01
fill_matrix(octave_value_l
ist const&, int, char const*)
  2.50      0.61     0.02        2    10.00    10.00  Array<octave_value
(*)(oct
ave_base_value&, octave_value_list const&, octave_base_value
const&)>::resize_an
d_fill(int, int, int, octave_value (* const&)(octave_base_value&,
octave_value_l
ist const&, octave_base_value const&))


The top call is Array<double>::elem !!!! The next is associated with the
path search at startup and the third call is the assign function itself.
Notice that ::assignN and not ::assign2 is called.

Looking at the ::assign functions they call lhs.elem(...). If we instead
call lhs.make_unique() once and then use lhs.xelem(...) we can already
get a significant improvement. However Array<T>::make_unique is
protected and so to support g++ 3.2 compilers it would have to be made
public rather than make the ::assign functions a friend of Array<T>.

Looking further there is code like

                  for (octave_idx_type i = 0; i < n; i++)
                    {
                      octave_idx_type ii = lhs_idx.elem (i);
                      lhs.xelem (ii) = rhs.elem (i);
                    }

that might be written as

              if (lhs_idx.is_colon ())
                {
                  for (octave_idx_type i = 0; i < n; i++)
                    lhs.xelem (i) = rhs.elem (i);
                }
              else
                {
                  for (octave_idx_type i = 0; i < n; i++)
                    {
                      octave_idx_type ii = lhs_idx.elem (i);
                      lhs.xelem (ii) = rhs.elem (i);
                    }
or even

                      for (octave_idx_type j = 0; j < m; j++)
                        {
                          octave_idx_type jj = idx_j.elem (j);
                          for (octave_idx_type i = 0; i < n; i++)
                            {
                              octave_idx_type ii = idx_i.elem (i);
                              lhs.xelem (ii, jj) = xrhs.elem (k++);
                            }
                        }

that might be written as

                      for (octave_idx_type j = 0; j < m; j++)
                        {
                          octave_idx_type off = dims(0) *
                                idx_j.elem (j);
                          for (octave_idx_type i = 0; i < n; i++)
                            {
                              octave_idx_type ii = idx_i.elem (i);
                              lhs.xelem (ii + off) = xrhs.elem (k++);
                            }
                        }

though it makes no sense to do that in ::assign2 as I'll address later.
Making these changes and then running the attached script with an
unaltered version of Octave I get

[t1,t2] = testsubsassgn([100,200,500,1000],[100,200,500,1000],5)

t1 =

   4.0000e-04   8.0000e-04   1.7998e-03   3.5996e-03
   4.0000e-04   9.9980e-04   1.9996e-03   4.1992e-03
   5.9980e-04   1.3996e-03   2.9994e-03   5.9990e-03
   7.9980e-04   1.7998e-03   4.5994e-03   8.5988e-03

t2 =

   8.0000e-04   1.7996e-03   4.1994e-03   8.7988e-03
   1.3998e-03   2.7996e-03   7.1988e-03   1.4398e-02
   2.9996e-03   5.9992e-03   1.5398e-02   3.0595e-02
   5.7992e-03   1.1598e-02   2.8796e-02   5.7591e-02

with the attached patch I get

[t1,t2] = testsubsassgn([100,200,500,1000],[100,200,500,1000],5)
t1 =

   2.0000e-04   8.0000e-04   1.5996e-03   3.3994e-03
   4.0000e-04   7.9980e-04   1.7996e-03   3.5994e-03
   4.0000e-04   5.9980e-04   1.9996e-03   4.3994e-03
   4.0000e-04   9.9980e-04   2.7996e-03   5.7992e-03

t2 =

   7.9980e-04   1.7996e-03   4.1994e-03   8.5988e-03
   1.3998e-03   2.9996e-03   6.9990e-03   1.4398e-02
   2.9994e-03   6.3990e-03   1.5398e-02   3.0395e-02
   5.7990e-03   1.1598e-02   2.8596e-02   5.7991e-02


the times are in seconds for 5 runs and 100x100, 100x200, ... 500x1000,
1000x1000 matrix constructed like "for i = 1:n, a(:,i) = ...; end" for
the first and "col=[1:n]"; for i = 1:n, a(col,i) = ...; end;" for the
second. The reason for the second test is to see the cost of the test
for is_colon on the cases where the index isn't a colon.


For the colon case the speed can be enormous.. Dividing one by the other
gives

   0.50000   1.00000   0.88877   0.94438
   1.00000   0.79996   0.89998   0.85716
   0.66689   0.42855   0.66667   0.73336
   0.50013   0.55551   0.60869   0.67442

The numerically indexed case has a few speed-ups, but some small
slow-downs as well. Dividing the t2 values give.

   0.99975   1.00000   1.00000   0.97727
   1.00000   1.07144   0.97225   1.00001
   0.99993   1.06664   1.00003   0.99345
   0.99997   0.99998   0.99307   1.00694

so the numerical indexing is basically identical and the differences are
probably just measurement error. Given that the "a(:,i)" indexing case
is really very common and that there can be almost a doubling in speed I
think the attached patch should probably be applied.. I also ran "make
check" with no issues with the attached change (sorry no changelog,
though will create one later)..

Another thing I noticed is that in ::assign there is the code

  return (lhs.ndims () == 2
          && (n_idx == 1
              || (n_idx < 3
                  && rhs.ndims () == 2
                  && rhs.rows () == 0 && rhs.columns () == 0)))
    ? assign2 (lhs, rhs, rfv) : assignN (lhs, rhs, rfv);

essentially assign2 is only called 2-D matrices with empty matrices for
the RHS or for 2-D matrices indexed with 1 index vector. The first some
case is trivial and just calls maybe_delete_elements, which is also in
assignN. So the above might be simplified to

return (lhs.dims () == 2 && n_idx == 1) ? assign2 (lhs, rhs, rfv) :
                        assignN (lhs, rhs, rfv);

in fact then perhaps assign2 might be significantly simplified, though
I'd rather not do that just before a 3.0 release. In any case the above
change makes no difference on the speed.

D.
Index: liboctave/Array.cc
===================================================================
RCS file: /usr/local/cvsroot/octave/liboctave/Array.cc,v
retrieving revision 1.159
diff -c -r1.159 Array.cc
*** liboctave/Array.cc  6 Sep 2007 12:08:44 -0000       1.159
--- liboctave/Array.cc  6 Sep 2007 20:59:53 -0000
***************
*** 1355,1361 ****
                    iidx++;
                  else
                    {
!                     new_data[ii] = elem (i);
                      ii++;
                    }
                }
--- 1355,1361 ----
                    iidx++;
                  else
                    {
!                     new_data[ii] = xelem (i);
                      ii++;
                    }
                }
***************
*** 1441,1447 ****
                iidx++;
              else
                {
!                 new_data[ii] = elem (i);
  
                  ii++;
                }
--- 1441,1447 ----
                iidx++;
              else
                {
!                 new_data[ii] = xelem (i);
  
                  ii++;
                }
***************
*** 1558,1564 ****
                          else
                            {
                              for (octave_idx_type i = 0; i < nr; i++)
!                               new_data[nr*jj+i] = elem (i, j);
                              jj++;
                            }
                        }
--- 1558,1564 ----
                          else
                            {
                              for (octave_idx_type i = 0; i < nr; i++)
!                               new_data[nr*jj+i] = xelem (i, j);
                              jj++;
                            }
                        }
***************
*** 1621,1627 ****
                          else
                            {
                              for (octave_idx_type j = 0; j < nc; j++)
!                               new_data[new_nr*j+ii] = elem (i, j);
                              ii++;
                            }
                        }
--- 1621,1627 ----
                          else
                            {
                              for (octave_idx_type j = 0; j < nc; j++)
!                               new_data[new_nr*j+ii] = xelem (i, j);
                              ii++;
                            }
                        }
***************
*** 1919,1925 ****
                              octave_idx_type kidx
                                = ::compute_index (temp_result_idx, 
new_lhs_dim);
  
!                             new_data[kidx] = elem (result_idx);
                            }
  
                          increment_index (result_idx, lhs_dims);
--- 1919,1925 ----
                              octave_idx_type kidx
                                = ::compute_index (temp_result_idx, 
new_lhs_dim);
  
!                             new_data[kidx] = xelem (result_idx);
                            }
  
                          increment_index (result_idx, lhs_dims);
***************
*** 1977,1983 ****
                    }
                  else
                    {
!                     new_data[ii++] = elem (lhs_ra_idx);
                    }
  
                  increment_index (lhs_ra_idx, lhs_dims);
--- 1977,1983 ----
                    }
                  else
                    {
!                     new_data[ii++] = xelem (lhs_ra_idx);
                    }
  
                  increment_index (lhs_ra_idx, lhs_dims);
***************
*** 2506,2511 ****
--- 2506,2513 ----
        }
        else
        {
+         lhs.make_unique ();
+ 
          if (rhs_len == n || rhs_len == 1)
            {
              octave_idx_type max_idx = lhs_idx.max () + 1;
***************
*** 2515,2534 ****
  
          if (rhs_len == n)
            {
!             for (octave_idx_type i = 0; i < n; i++)
                {
!                 octave_idx_type ii = lhs_idx.elem (i);
!                 lhs.elem (ii) = rhs.elem (i);
                }
            }
          else if (rhs_len == 1)
            {
              RT scalar = rhs.elem (0);
  
!             for (octave_idx_type i = 0; i < n; i++)
                {
!                 octave_idx_type ii = lhs_idx.elem (i);
!                 lhs.elem (ii) = scalar;
                }
            }
          else
--- 2517,2552 ----
  
          if (rhs_len == n)
            {
!             if (lhs_idx.is_colon ())
                {
!                 for (octave_idx_type i = 0; i < n; i++)
!                   lhs.xelem (i) = rhs.elem (i);
!               }
!             else
!               {
!                 for (octave_idx_type i = 0; i < n; i++)
!                   {
!                     octave_idx_type ii = lhs_idx.elem (i);
!                     lhs.xelem (ii) = rhs.elem (i);
!                   }
                }
            }
          else if (rhs_len == 1)
            {
              RT scalar = rhs.elem (0);
  
!             if (lhs_idx.is_colon ())
                {
!                 for (octave_idx_type i = 0; i < n; i++)
!                   lhs.xelem (i) = scalar;
!               }
!             else
!               {
!                 for (octave_idx_type i = 0; i < n; i++)
!                   {
!                     octave_idx_type ii = lhs_idx.elem (i);
!                     lhs.xelem (ii) = scalar;
!                   }
                }
            }
          else
***************
*** 2546,2555 ****
  
        if (lhs_dims.all_zero ())
        {
          lhs.resize_no_fill (rhs_len);
  
          for (octave_idx_type i = 0; i < rhs_len; i++)
!           lhs.elem (i) = rhs.elem (i);
        }
        else if (rhs_len != lhs_len)
        (*current_liboctave_error_handler)
--- 2564,2575 ----
  
        if (lhs_dims.all_zero ())
        {
+         lhs.make_unique ();
+ 
          lhs.resize_no_fill (rhs_len);
  
          for (octave_idx_type i = 0; i < rhs_len; i++)
!           lhs.xelem (i) = rhs.elem (i);
        }
        else if (rhs_len != lhs_len)
        (*current_liboctave_error_handler)
***************
*** 2669,2674 ****
--- 2689,2696 ----
  
                  if (n > 0 && m > 0)
                    {
+                     lhs.make_unique ();
+ 
                      MAYBE_RESIZE_LHS;
  
                      RT scalar = xrhs.elem (0, 0);
***************
*** 2679,2685 ****
                          for (octave_idx_type i = 0; i < n; i++)
                            {
                              octave_idx_type ii = idx_i.elem (i);
!                             lhs.elem (ii, jj) = scalar;
                            }
                        }
                    }
--- 2701,2707 ----
                          for (octave_idx_type i = 0; i < n; i++)
                            {
                              octave_idx_type ii = idx_i.elem (i);
!                             lhs.xelem (ii, jj) = scalar;
                            }
                        }
                    }
***************
*** 2688,2693 ****
--- 2710,2717 ----
                       && (rhs_nr == 1 || rhs_nc == 1)
                       && n * m == rhs_nr * rhs_nc)
                {
+                 lhs.make_unique ();
+ 
                  MAYBE_RESIZE_LHS;
  
                  if (n > 0 && m > 0)
***************
*** 2700,2712 ****
                          for (octave_idx_type i = 0; i < n; i++)
                            {
                              octave_idx_type ii = idx_i.elem (i);
!                             lhs.elem (ii, jj) = xrhs.elem (k++);
                            }
                        }
                    }
                }
              else if (n == rhs_nr && m == rhs_nc)
                {
                  MAYBE_RESIZE_LHS;
  
                  if (n > 0 && m > 0)
--- 2724,2738 ----
                          for (octave_idx_type i = 0; i < n; i++)
                            {
                              octave_idx_type ii = idx_i.elem (i);
!                             lhs.xelem (ii, jj) = xrhs.elem (k++);
                            }
                        }
                    }
                }
              else if (n == rhs_nr && m == rhs_nc)
                {
+                 lhs.make_unique ();
+ 
                  MAYBE_RESIZE_LHS;
  
                  if (n > 0 && m > 0)
***************
*** 2717,2723 ****
                          for (octave_idx_type i = 0; i < n; i++)
                            {
                              octave_idx_type ii = idx_i.elem (i);
!                             lhs.elem (ii, jj) = xrhs.elem (i, j);
                            }
                        }
                    }
--- 2743,2749 ----
                          for (octave_idx_type i = 0; i < n; i++)
                            {
                              octave_idx_type ii = idx_i.elem (i);
!                             lhs.xelem (ii, jj) = xrhs.elem (i, j);
                            }
                        }
                    }
***************
*** 2862,2881 ****
                }
              else if (len == rhs_nr * rhs_nc)
                {
!                 for (octave_idx_type i = 0; i < len; i++)
                    {
!                     octave_idx_type ii = idx_i.elem (i);
!                     lhs.elem (ii) = xrhs.elem (i);
                    }
                }
              else if (rhs_is_scalar)
                {
                  RT scalar = rhs.elem (0, 0);
  
!                 for (octave_idx_type i = 0; i < len; i++)
                    {
!                     octave_idx_type ii = idx_i.elem (i);
!                     lhs.elem (ii) = scalar;
                    }
                }
              else
--- 2888,2927 ----
                }
              else if (len == rhs_nr * rhs_nc)
                {
!                 lhs.make_unique ();
! 
!                 if (idx_i.is_colon ())
!                   {
!                     for (octave_idx_type i = 0; i < len; i++)
!                       lhs.xelem (i) = xrhs.elem (i);
!                   }
!                 else
                    {
!                     for (octave_idx_type i = 0; i < len; i++)
!                       {
!                         octave_idx_type ii = idx_i.elem (i);
!                         lhs.xelem (ii) = xrhs.elem (i);
!                       }
                    }
                }
              else if (rhs_is_scalar)
                {
+                 lhs.make_unique ();
+ 
                  RT scalar = rhs.elem (0, 0);
  
!                 if (idx_i.is_colon ())
                    {
!                     for (octave_idx_type i = 0; i < len; i++)
!                       lhs.xelem (i) = scalar;
!                   }
!                 else
!                   {
!                     for (octave_idx_type i = 0; i < len; i++)
!                       {
!                         octave_idx_type ii = idx_i.elem (i);
!                         lhs.xelem (ii) = scalar;
!                       }
                    }
                }
              else
***************
*** 2934,2941 ****
    else if (n_idx == 1)
      {
        idx_vector iidx = idx(0);
  
!       if (! (iidx.is_colon ()
             || (iidx.one_zero_only ()
                 && iidx.orig_dimensions () == lhs.dims ())))
        (*current_liboctave_warning_with_id_handler)
--- 2980,2988 ----
    else if (n_idx == 1)
      {
        idx_vector iidx = idx(0);
+       int iidx_is_colon = iidx.is_colon ();
  
!       if (! (iidx_is_colon
             || (iidx.one_zero_only ()
                 && iidx.orig_dimensions () == lhs.dims ())))
        (*current_liboctave_warning_with_id_handler)
***************
*** 2959,2980 ****
            }
          else if (len == rhs.length ())
            {
!             for (octave_idx_type i = 0; i < len; i++)
                {
!                 octave_idx_type ii = iidx.elem (i);
  
!                 lhs.elem (ii) = rhs.elem (i);
                }
            }
          else if (rhs_is_scalar)
            {
              RT scalar = rhs.elem (0);
  
!             for (octave_idx_type i = 0; i < len; i++)
                {
!                 octave_idx_type ii = iidx.elem (i);
  
!                 lhs.elem (ii) = scalar;
                }
            }
          else
--- 3006,3047 ----
            }
          else if (len == rhs.length ())
            {
!             lhs.make_unique ();
! 
!             if (iidx_is_colon)
                {
!                 for (octave_idx_type i = 0; i < len; i++)
!                   lhs.xelem (i) = rhs.elem (i);
!               }
!             else
!               {
!                 for (octave_idx_type i = 0; i < len; i++)
!                   {
!                     octave_idx_type ii = iidx.elem (i);
  
!                     lhs.xelem (ii) = rhs.elem (i);
!                   }
                }
            }
          else if (rhs_is_scalar)
            {
              RT scalar = rhs.elem (0);
  
!             lhs.make_unique ();
! 
!             if (iidx_is_colon)
                {
!                 for (octave_idx_type i = 0; i < len; i++)
!                   lhs.xelem (i) = scalar;
!               }
!             else
!               {
!                 for (octave_idx_type i = 0; i < len; i++)
!                   {
!                     octave_idx_type ii = iidx.elem (i);
  
!                     lhs.xelem (ii) = scalar;
!                   }
                }
            }
          else
***************
*** 3131,3136 ****
--- 3198,3205 ----
  
              if (rhs_is_scalar)
                {
+                 lhs.make_unique ();
+ 
                  if (n_idx < orig_lhs_dims_len)
                    lhs = lhs.reshape (lhs_dims);
  
***************
*** 3146,3156 ****
  
                          octave_idx_type len = frozen_len(0);
  
!                         for (octave_idx_type i = 0; i < len; i++)
                            {
!                             octave_idx_type ii = iidx.elem (i);
  
!                             lhs.elem (ii) = scalar;
                            }
                        }
                      else if (lhs_dims_len == 2 && n_idx == 2)
--- 3215,3233 ----
  
                          octave_idx_type len = frozen_len(0);
  
!                         if (iidx.is_colon ())
                            {
!                             for (octave_idx_type i = 0; i < len; i++)
!                               lhs.xelem (i) = scalar;
!                           }
!                         else
!                           {
!                             for (octave_idx_type i = 0; i < len; i++)
!                               {
!                                 octave_idx_type ii = iidx.elem (i);
  
!                                 lhs.xelem (ii) = scalar;
!                               }
                            }
                        }
                      else if (lhs_dims_len == 2 && n_idx == 2)
***************
*** 3161,3173 ****
                          octave_idx_type i_len = frozen_len(0);
                          octave_idx_type j_len = frozen_len(1);
  
!                         for (octave_idx_type j = 0; j < j_len; j++)
                            {
!                             octave_idx_type jj = idx_j.elem (j);
!                             for (octave_idx_type i = 0; i < i_len; i++)
                                {
!                                 octave_idx_type ii = idx_i.elem (i);
!                                 lhs.elem (ii, jj) = scalar;
                                }
                            }
                        }
--- 3238,3264 ----
                          octave_idx_type i_len = frozen_len(0);
                          octave_idx_type j_len = frozen_len(1);
  
!                         if (idx_i.is_colon())
                            {
!                             for (octave_idx_type j = 0; j < j_len; j++)
                                {
!                                 octave_idx_type off = new_dims (0) *
!                                   idx_j.elem (j);
!                                 for (octave_idx_type i = 0; i < i_len; i++)
!                                   lhs.xelem (i + off) = scalar;
!                               }
!                           }
!                         else
!                           {
!                             for (octave_idx_type j = 0; j < j_len; j++)
!                               {
!                                 octave_idx_type off = new_dims (0) *
!                                   idx_j.elem (j);
!                                 for (octave_idx_type i = 0; i < i_len; i++)
!                                   {
!                                     octave_idx_type ii = idx_i.elem (i);
!                                     lhs.xelem (ii + off) = scalar;
!                                   }
                                }
                            }
                        }
***************
*** 3181,3187 ****
                            {
                              Array<octave_idx_type> elt_idx = get_elt_idx 
(idx, result_idx);
  
!                             lhs.elem (elt_idx) = scalar;
  
                              increment_index (result_idx, frozen_len);
                            }
--- 3272,3278 ----
                            {
                              Array<octave_idx_type> elt_idx = get_elt_idx 
(idx, result_idx);
  
!                             lhs.xelem (elt_idx) = scalar;
  
                              increment_index (result_idx, frozen_len);
                            }
***************
*** 3203,3208 ****
--- 3294,3301 ----
                    }
                  else
                    {
+                     lhs.make_unique ();
+ 
                      if (n_idx < orig_lhs_dims_len)
                        lhs = lhs.reshape (lhs_dims);
  
***************
*** 3216,3226 ****
  
                              octave_idx_type len = frozen_len(0);
  
!                             for (octave_idx_type i = 0; i < len; i++)
                                {
!                                 octave_idx_type ii = iidx.elem (i);
  
!                                 lhs.elem (ii) = rhs.elem (i);
                                }
                            }
                          else if (lhs_dims_len == 2 && n_idx == 2)
--- 3309,3327 ----
  
                              octave_idx_type len = frozen_len(0);
  
!                             if (iidx.is_colon ())
                                {
!                                 for (octave_idx_type i = 0; i < len; i++)
!                                   lhs.xelem (i) = rhs.elem (i);
!                               }
!                             else
!                               {
!                                 for (octave_idx_type i = 0; i < len; i++)
!                                   {
!                                     octave_idx_type ii = iidx.elem (i);
  
!                                     lhs.xelem (ii) = rhs.elem (i);
!                                   }
                                }
                            }
                          else if (lhs_dims_len == 2 && n_idx == 2)
***************
*** 3232,3246 ****
                              octave_idx_type j_len = frozen_len(1);
                              octave_idx_type k = 0;
  
!                             for (octave_idx_type j = 0; j < j_len; j++)
                                {
!                                 octave_idx_type jj = idx_j.elem (j);
!                                 for (octave_idx_type i = 0; i < i_len; i++)
                                    {
!                                     octave_idx_type ii = idx_i.elem (i);
!                                     lhs.elem (ii, jj) = rhs.elem (k++);
                                    }
                                }
                            }
                          else
                            {
--- 3333,3363 ----
                              octave_idx_type j_len = frozen_len(1);
                              octave_idx_type k = 0;
  
!                             if (idx_i.is_colon())
                                {
!                                 for (octave_idx_type j = 0; j < j_len; j++)
                                    {
!                                     octave_idx_type off = new_dims (0) * 
!                                       idx_j.elem (j);
!                                     for (octave_idx_type i = 0; 
!                                          i < i_len; i++)
!                                       lhs.xelem (i + off) = rhs.elem (k++);
                                    }
                                }
+                             else
+                               {
+                                 for (octave_idx_type j = 0; j < j_len; j++)
+                                   {
+                                     octave_idx_type off = new_dims (0) * 
+                                       idx_j.elem (j);
+                                     for (octave_idx_type i = 0; i < i_len; 
i++)
+                                       {
+                                         octave_idx_type ii = idx_i.elem (i);
+                                         lhs.xelem (ii + off) = rhs.elem (k++);
+                                       }
+                                   }
+                               }
+ 
                            }
                          else
                            {
***************
*** 3252,3258 ****
                                {
                                  Array<octave_idx_type> elt_idx = get_elt_idx 
(idx, result_idx);
  
!                                 lhs.elem (elt_idx) = rhs.elem (i);
  
                                  increment_index (result_idx, frozen_len);
                                }
--- 3369,3375 ----
                                {
                                  Array<octave_idx_type> elt_idx = get_elt_idx 
(idx, result_idx);
  
!                                 lhs.xelem (elt_idx) = rhs.elem (i);
  
                                  increment_index (result_idx, frozen_len);
                                }
Index: liboctave/Array.h
===================================================================
RCS file: /usr/local/cvsroot/octave/liboctave/Array.h,v
retrieving revision 1.100
diff -c -r1.100 Array.h
*** liboctave/Array.h   6 Sep 2007 12:08:45 -0000       1.100
--- liboctave/Array.h   6 Sep 2007 20:59:53 -0000
***************
*** 110,115 ****
--- 110,120 ----
  
    //--------------------------------------------------------------------
  
+ public:
+ 
+   // !!! WARNING !!! -- these should be protected, not public.  You
+   // should not access these methods directly!
+ 
    void make_unique (void)
      {
        if (rep->count > 1)
***************
*** 130,137 ****
        rep->fill (val);
      }
  
- public:
- 
    typedef T element_type;
  
    // !!! WARNING !!! -- these should be protected, not public.  You
--- 135,140 ----
function [t1, t2] = testsubsassgn (m, n, nrun)
  if (nargin < 3)
    nrun = 3;
  endif
  if (nargin < 2)
    n = [10, 100];
  endif
  if (nargin < 1)
    m = [10, 100];
  endif
  tcat = zeros (length(m), length(n));
  tsubs = zeros (length(m), length(n));
  for i = 1 : length(m)
    col = ones (m(i), 1);
    col2 = [1:m(i)];
    for j = 1 : length(n)
      for k = 1 : nrun
        b = zeros (m(i), n(j));
        t0 = cputime ();
        for l = 1 : n (j)
          b(:,l) = col;
        endfor
        t1 (i, j) = cputime () - t0;
        b = zeros(m(i), n(j));
        t0 = cputime();
        for l = 1 : n (j)
          b(col2,l) = col;
        endfor
        t2 (i, j) = cputime () - t0;
      endfor
      t1 (i, j) /= nrun;
      t2 (i, j) /= nrun;
    endfor
  endfor
endfunction

reply via email to

[Prev in Thread] Current Thread [Next in Thread]