octave-maintainers

## The nanflag parameter

From: Ștefan-Gabriel Mirea
Subject: The nanflag parameter
Date: Sat, 14 Jan 2017 04:31:46 +0200

```
Hello,

My name is Stefan Mirea. I am a fourth-year student of Automatic
Control and Computer Science at the "Politehnica" University of
Bucharest. As Octave has been one of my favourite tools during my
university years, I wish to contribute to its development.

I chose to start with bug #50007. Unfortunately, I don't think there
is any elegant fix that does not require changing the Octave API. Since
my solution would involve modifying multiple function prototypes, I
decided to describe it below and ask for your opinion before working
on a patch.

According to the MATLAB documentation[1], the functions which accept a
nanflag argument are:
* min
* max
* cummin
* cummax
* sum
* cumsum
* mean
* median
* var
* std
* cov
* medfilt1

together with some recently introduced descriptive statistics
functions, which are not currently implemented in Octave: movsum,
movmean, movmedian, movmax, movmin, movstd, movvar.

For the one-array-input syntaxes of min and max, I would:

* modify the signature of the do_minmax_red_op() function in max.cc,
in order to receive the boolean nan-flag from do_minmax_body().
do_minmax_red_op() is never called from elsewhere. Unfortunately, the
two template specializations, for charNDArray and boolNDArray
respectively, would need to be updated as well, even though these
types don't support NaN values.

* add a nan-flag parameter to the min() and max() methods of all the
classes with which do_minmax_red_op() is instantiated (except
boolNDArray): SparseMatrix, NDArray, SparseComplexMatrix,
ComplexNDArray, FloatNDArray, FloatComplexNDArray, charNDArray and
intNDArray<*> (although the charNDArray methods could also be left
unchanged because of the specialization). Although these min/max
methods are part of the Octave API, using a default argument would
ensure that no external code is affected. Inside the Octave core, they
are only called from max.cc. Also, I believe that consistency with the
min() and max() methods of other classes would not be a problem, as
the classes above are already the only ones whose min/max methods take
this form (receiving the dimension along which the reduction is
performed; ColumnVector::min(), for example, has no parameters).

* add a nan-flag parameter to mx_inline_min() and mx_inline_max().
These functions are used only in the min()/max() methods of the Array
(not Sparse) classes above. The overloads defined with OP_MINMAX_FCNN
would just pass the flag unchanged when calling the OP_MINMAX_FCN or
OP_MINMAX_FCN2 version.

* update do_mx_minmax_op() to receive the nan-flag from the caller
min()/max() method and send it to mx_minmax_op() (which now accepts
it). Again, do_mx_minmax_op() is used only in the min()/max() methods
of the Array classes above.

* update SparseMatrix::min()/max() as well as the versions of
mx_inline_min() and mx_inline_max() defined with OP_MINMAX_FCN and
OP_MINMAX_FCN2 to take the received NaN policy into account.

For the two-array-input syntax of min and max, I would:

* send the nan-flag from do_minmax_body() to do_minmax_bin_op()
similarly (even if it is not needed by the charNDArray
specialization).

* either find a general mechanism to tell the binary operations if NaN
values must be propagated (by altering do_sm_binary_op() /
do_ms_binary_op() / do_mm_binary_op()) or create alternative
mx_inline_xmin() / mx_inline_xmax() functions for the "includenan"
case (e.g. mx_inline_xmin_includenan). Since other mx_inline_*()
functions would not use this feature given the current MATLAB syntax,
I believe that the second alternative is much better.

* either:
a) add the nan-flag parameter to all the scalar-matrix /
matrix-scalar / matrix-matrix min() and max() functions. While, again,
this implies changing the Octave API, using a default argument can
keep external code functional. Inside the Octave core, these functions
are also called only from max.cc. Unfortunately, the Matrix,
FloatMatrix, ComplexMatrix and FloatComplexMatrix versions of min()
and max() should also be changed for consistency, although they would
never be called with the flag set (when calling min or max from the
interpreter). For NDArray, ComplexNDArray, FloatNDArray,
FloatComplexNDArray, intNDArray<*> and charNDArray, the min() / max()
functions (usually defined with MINMAX_FCNS, except in charNDArray)
would just check whether to pass mx_inline_x##FCN or
mx_inline_x##FCN##_includenan to do_*_binary_op() (assuming the
creation of the mx_inline_*_includenan functions). For SparseMatrix
and SparseComplexMatrix, the algorithm in min() and max() would be
updated to respect the flag.

or:
b) call do_sm/ms/mm_binary_op() directly from do_minmax_bin_op(),
avoiding the min() / max() functions. In addition to losing
modularity, SparseMatrix and SparseComplexMatrix would probably need

The situation for cummin and cummax is very similar to the
one-array-input min/max case (actually, I think that SparseMatrix and
SparseComplexMatrix should also have their own cummin() and cummax()
methods, because in MATLAB cummin and cummax return a sparse matrix
when called with a sparse argument).

Regarding the other functions, they are different in that the
currently implemented behaviour is equivalent to "includenan". Most of
them seem to have "omitnan" counterparts in the nan package, but, of
course, not in a manner compatible with MATLAB.

I would be thankful if you could give me some feedback on the approach
above.

Regards,
Stefan

```
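The core of the proposal above (a defaulted nan-flag on the reduction, plus a separate "includenan" kernel next to the existing one) can be illustrated with a minimal, standalone sketch. It does not use Octave's actual classes or macros: `xmin_omitnan`, `xmin_includenan`, `reduce_min` and `elementwise_min` are hypothetical names operating on `std::vector<double>`, chosen only to show how a default argument keeps existing callers working while the flag selects the kernel.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical stand-ins for the element-wise kernels discussed in the
// message; these are NOT Octave's real mx_inline_* functions.

// "omitnan" binary minimum: a NaN operand is ignored when the other
// operand is a number; the result is NaN only if both operands are NaN.
inline double xmin_omitnan (double x, double y)
{
  if (std::isnan (x)) return y;
  if (std::isnan (y)) return x;
  return x < y ? x : y;
}

// "includenan" binary minimum: any NaN operand propagates to the result.
inline double xmin_includenan (double x, double y)
{
  if (std::isnan (x) || std::isnan (y))
    return std::numeric_limits<double>::quiet_NaN ();
  return x < y ? x : y;
}

// Reduction over a vector. The flag defaults to false ("omitnan"), so
// callers written before the flag existed keep compiling and keep
// getting the same results. An empty input yields NaN in this sketch.
inline double reduce_min (const std::vector<double>& v,
                          bool includenan = false)
{
  double (*op) (double, double)
    = includenan ? xmin_includenan : xmin_omitnan;

  double acc = std::numeric_limits<double>::quiet_NaN ();
  bool first = true;
  for (double x : v)
    {
      acc = first ? x : op (acc, x);
      first = false;
    }
  return acc;
}

// Two-array form: the flag only decides which kernel is applied.
// Assumes a and b have the same length (no broadcasting in this sketch).
inline std::vector<double>
elementwise_min (const std::vector<double>& a,
                 const std::vector<double>& b,
                 bool includenan = false)
{
  double (*op) (double, double)
    = includenan ? xmin_includenan : xmin_omitnan;

  std::vector<double> r (a.size ());
  for (std::size_t i = 0; i < a.size (); i++)
    r[i] = op (a[i], b[i]);
  return r;
}
```

With these definitions, `reduce_min ({3.0, NaN, 1.0})` skips the NaN, while `reduce_min ({3.0, NaN, 1.0}, true)` returns NaN. Selecting a function pointer once per call, rather than testing the flag per element, mirrors the idea of choosing between two kernels before handing one to the generic binary-operation helper.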