Re: Changesets: Re: Parallel access to data created with Octave

octave-maintainers
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: Changesets: Re: Parallel access to data created with Octave

From:	Jarno Rajahalme
Subject:	Re: Changesets: Re: Parallel access to data created with Octave
Date:	Wed, 26 May 2010 09:20:23 -0700
On May 25, 2010, at 11:55 PM, ext Jaroslav Hajek wrote:

> On Tue, May 25, 2010 at 9:05 PM, Jarno Rajahalme
> <address@hidden> wrote:
>> 
>> On May 25, 2010, at 12:43 AM, ext Jaroslav Hajek wrote:
>> 
>>> On Tue, May 25, 2010 at 6:16 AM, Jarno Rajahalme
>>> <address@hidden> wrote:
>>>> 
>>>> liboctave:
>>>> 
>>>> - Followed the Array.h practice of returning const ref from dims() 
>>>> (Sparse.h)
>>> 
>>> I'm not sure this is a good idea. It is a bit redundant to store a
>>> dim_vector in Sparse, because it can only have 2 dimensions, and we
>>> may want to optimize the storage in the future.
>>> 
>> 
>> Please note that I only changed the existing Sparse::dims () to return a 
>> const reference instead of the full object. I did not add any storage 
>> requirement anywhere! Also, the modified Sparse::dims () can still be used 
>> for copying the object itself, but now the copy is done after the dims() 
>> returns, not before, so the caller has a choice. I.e.:
>> 
>> // Create a local copy (implies reference counting, copy-on-write, etc.)
>> 
>>  dim_vector dv = dims();
>> 
>> // Create a const reference (no reference counting, cannot be modified)
>> 
>>  const dim_vector& dv = dims();
>> 
>> Without this change the latter use will ALSO create a temporary copy, so it 
>> has the same overhead and thread-unsafeness as the former one.
> 
> The storage requirement is implied. Suppose we decide, instead of
> storing a dim_vector in Sparse<T>, to store just the row dimension in
> order to conserve space (column dim can be inquired from SparseRep).
> What will dims() return a reference to? It will need to be changed
> back.
> 
>> 
>> Again, this exact change has been made in Array.h earlier, there is even a 
>> comment about it :-)
>> 
> 
> Yes, but still it was disputable. dim_vector was created for Array, so
> it will probably always be stored directly. In any case, I made this
> change relatively recently, and I may still change my mind about it.
> 

I was not aware of the need for preserving space for Sparse or SparseRep. But 
looking at SparseRep, it seems that it's count, nrows and ncols could be 
replaced by an array resembling the dim_vector's rep, or a dim_vector object 
itself. That way the extra dim_vector in the Sparse itself would not be needed 
any more? Or is it more complicated than that?

>> 
>> Maybe changing more code to use dims () directly (e.g. dims ().length () ), 
>> i.e. without using a local copy would be better in cases where the dv is 
>> actually used only once or twice.
> 
> Unless it's a performance critical place, people should use whatever
> approach seems more convenient.
> 

OK.

>>> 
>>>> - Changed dim_vector local variable to const ref when available 
>>>> (bitfcns.cc, data.cc, oct-map.cc, ov-base-diag.cc, ov-base-int.cc, 
>>>> ov-base-mat.cc, ov-base-sparse.cc, ov-bool-mat.cc, ov-bool-sparse.cc, 
>>>> ov-cell.cc, ov-cx-mat.cc, ov-cx-sparse.cc, ov-flt-cx-mat.cc, 
>>>> ov-flt-re-mat.cc, ov-intx.h, ov-perm.cc, ov-re-mat.cc, ov-re-sparse.cc, 
>>>> ov-str-mat.cc, ov-struct.cc, pr-output.cc, pt-mat.cc, strfns.cc, xpow.cc)
>>> 
>>> See above. The return-by-value style is idiomatic and convenient
>>> (because it doesn't matter whether the function returns a value or a
>>> reference), even if we make this change now, there's no guarantee
>>> people will keep using this style. I probably won't.
>>> 
>> 
>> First of all, the reference idiom also works regardless whether the function 
>> returns a value or a reference, and it has no penalty in either case.
> 
> No, it doesn't. It's OK if you return a const ref and assign it to a
> variable (because you just assign the referenced object) and it's OK
> if you return an object and make a const reference to it (because C++
> holds the temporary for the duration of the const reference, but some
> people don't know this). But, whenever a reference is returned from a
> function (or, for that matter, passed to it), it is solely the
> programmer's responsibility to ensure that the reference remains valid
> as long as it is used. The compiler won't guarantee this because it
> has no way to do so. Consider this snippet:
> 
> const dim_vector& orig_dims = matrix.dims (); // matrix is Array<T> subclass
> ...
> matrix.resize (new_dims); // resize to new dimensions
> ...
> matrix.resize (orig_dims); // gotcha!
> 
> the problem is that at the second line, the dim_vector object
> referenced by orig_dims silently changes, so the final resize will do
> something unexpected (do nothing).
> 
> Another simple instance is (disregarding the fact that it makes little sense):
> 
> octave_value val = ...
> const dim_vector& dv = val.array_value ().dims ();
> 
> here, val.array_value () creates a temporary NDArray object, whose
> dims () method is the called. This method returns a reference, so no
> temporary is withheld. C++ has no way of knowing that it needs to hold
> the intermediate expressions, too. It can only hold the final one. So
> after this statement, the temporary NDArray object is destroyed, but
> the reference still points to the temporary object. Ouch.
> 
> Yes, both examples seem somewhat artificial and probably can be
> avoided easily, but there may be less obvious cases. And with the
> other reference counted classes, it may be even easier to shoot
> yourself in the foot.
> 
> 
>> So, people don't need to, the code works either way. I did change all the 
>> applicable cases, though, so as long as people cut & paste within the 
>> codeset, the practice might as well stick, who knows. Anyway, it IS more 
>> efficient and safe.
> 
> More efficient, yes (at least usually). Safe, no. I have no problem
> with using this, especially where performance seems critical, but I'm
> against making the changes globally.
> 
> 
> "usually" is exact because a reference is ultimately just a pointer
> that is usually optimized away; usually, but not always. It can only
> be optimized away when the compiler can determine that the
> const-referenced object does not change (by means other than the const
> reference). The compiler has to assume the worst here. Fortran is
> better(?) in this regard because it simply declares all such programs
> illegal and so the compiler needs not care :)
> 
> If the caller tries to modify the dv, the const ref has to be changed
> to a non-const local instance and everything is OK.


Agree. I would conclude that it is safe to use "const dim_vector&" whenever it 
is taken from an object that is passed in as a const & parameter (which means 
that the object must not be changed, not even it's dim_vector). Also, one-off 
use (e.g. dims ().length ()) is also safe, but actually storing the reference 
in other cases should be done only in performance critical cases, when it can 
be maintained that the underlying dim_vector does not change.

Currently, it seems that reference counting is not an performance issue, but 
this might change, if/when the reference counting is made thread-safe (see the 
other thread).

>>>> - Changed to avoid calling dims() or numel() via the virtual members 
>>>> (ov-base-diag.cc, ov-base-mat.cc, ov-base-sparse.cc, ov-bool-mat.cc, 
>>>> ov-bool-mat.h, ov-bool-sparse.cc, ov-cell.cc, ov-cx-mat.cc, 
>>>> ov-cx-sparse.cc, ov-flt-cx-mat.cc, ov-flt-re-mat.cc, ov-intx.h, 
>>>> ov-perm.cc, ov-range.cc, ov-re-mat.cc, ov-re-sparse.cc, ov-str-mat.cc, 
>>>> ov-struct.cc)
>>>> 
>>> 
>>> Except in supposedly performance-critical places, I'd rather avoid
>>> doing this. Pieces of code are copied often between subclass
>>> implementations, and this change makes it harder.
>> 
>> In my (very recent) experience with octave code, most of the subclasses use 
>> the "matrix" member, so there should be a problem. Also, I only followed the 
>> practice already used in parts of the code.
> 
> I don't think so. I used to change the virtual calls to direct calls
> in certain places where I thought it could help, but there's a number
> of, say, octave_matrix:: methods that just call dims() or numel ().
> Check for yourself.
> 

I was not clear enough. I meant that most of the time, the calls are in object 
that have a "matrix" member, so using it is not a big barrier for cur&paste 
coding.

>> 
>> Let me elaborate the reasoning here:
>> 
>> - Classes derived from Array have access to a const reference to the 
>> dim_vector via Array::dims (). The subclasses can use the dims () method for 
>> accessing the dim_vector without any performance penalty, and in a thread 
>> safe manner. All use of dims () can be inlined, etc.
> 
> Yes, this was the reason why I changed Array<T>::dims. Originally it
> returned a value.
> 
>> - Classes derived from octave_base have dims () method that returns 
>> dim_vector object, that in some cases is newly constructed, and always 
>> copied, involving (in many cases unnecessary) reference counting. This dims 
>> () being virtual, it can never be inlined, as the compiler will not know at 
>> the compile time which implementation will be used.
>> 
>> So the latter has performance penalties via:
>> 1) dim_vector constructor/destructor
>> 2) dim_vector::rep reference counting
>> 3) no inlining
>> 4) mandatory dispatch via vtable, many times canceling the benefit of branch 
>> prediction, which can double the execution time
>> 5) not thread safe
>> 
> 
> Yes, but you can't avoid the virtual methods normally,

But it gets even worse when there are virtual methods calling each other. None 
of them can be inlined.

> 
> I think you've gone much farther than suitable with your changes. I'm
> against implementing octave_fcn_handle::rows et al just for the sake
> of preventing octave_value::numel doing reference counting behind the
> hood. As I said, I'm OK having the overrides for scalars and matrices,
> but not everywhere.
> 


I'll make an another proposal along the lines of this discussion in a few days.

>>>> 
>>> 
>>> Yes. The point is that dim_vector is just one class that is reference
>>> counted, there are many more - idx_vector, Array, Sparse, SparseQR,
>>> octave_shlib, octave_mutex, octave_value, graphics properties ...
>>> 
>> 
>> There is a subtle, but important difference: none of the above are needed 
>> for just *reading* the octave data, while in practice either numel(), 
>> columns(), rows() or capacity() is needed to know how much data there is.
> 
> No, it isn't. Just like you reference the octave_base_matrix::matrix
> members directly through pointer tricks, you can reference
> octave_base_matrix::matrix::dims (), and it's thread-safe (thanks to
> the const-ref being returned). octave_value::numel() is *not* a safe
> way to determine what polymorphic instance you have, and never will
> be. Admittedly, if Array::dims didn't return a const ref, you would
> have no thread-safe way to get there, so it's good this has changed,
> but I think it's not needed to go much farther.
> 

Could you please show how I get from a octave_base_value * to the length of the 
data without using numel ?

>> 
>>>> There are minor efficiency increases all over the place, due to not 
>>>> creating a local instance of dim_vector, where it is not needed.
>>> 
>>> The difference between const dim_vector and const dim_vector& is just
>>> the counter increment/decrement, and I doubt that is visible in most
>>> places.
>>> 
>> 
>> Right. I totally agree that when printing, saving, etc. the overhead is 
>> totally absorbed by orders of magnitude bigger overheads. But maybe keeping 
>> the changes I proposed would entice others later using the same idioms, when 
>> creating new code that might be more performance critical (cut & paste)?
> 
> Cf. above arguments against adopting this as idiomatic. It's OK if it
> feels natural to you and you can avoid the subtle problems (I suppose
> you can easily), but otherwise I'm against "fixing" this in others'
> code. Also, note that JWE often makes "style fixes" to the code and
> he's even much less concerned by performance than me, so if you use a
> const reference on purpose for performance reasons, better add an
> explanation comment (such as the one for Array<T>::dims) to prevent
> someone else from changing it later.
> 

OK.

>> 
>> Btw. I also noticed that cellfun() itself has some overhead. My own 
>> "celltest.cc", which copied only the relevant parts of cellfun() was faster.
> 
> Hmm. Maybe you omitted something that cellfun needs to do? If you
> don't think so, please show me the code. Or if you can figure out how
> cellfun can be further sped up (preferably without invasive changes),
> that would be even better :)
> 

I just cut&pasted the cellfun itself, but only upto the point where the arg0 is 
a string recognized by cellfun (e.g., "numel") and without any of the preceding 
C++ helper code.

  Jarno
[Prev in Thread]
Current Thread
[Next in Thread]
Re: Changesets: Re: Parallel access to data created with Octave, (continued)
Prev by Date: Re: Thread-safe reference counting (modified dim-vector.h)
Next by Date: Re: Thread-safe reference counting (modified dim-vector.h)
Previous by thread: Re: Changesets: Re: Parallel access to data created with Octave
Next by thread: Re: Changesets: Re: Parallel access to data created with Octave
Index(es):
- Date
- Thread