Re: covariance matrix
From: Vic Norton
Subject: Re: covariance matrix
Date: Mon, 19 Feb 2007 17:49:21 -0500
# The Octave cov function is incorrect.
# Here is a definition with an example and references.
# Example
X = [
3 5 3 7 1 5
2 1 4 2 2 1
1 5 1 1 7 2
];
Y = [
1 1 4 2 6 1
3 1 2 2 1 1
2 1 3 1 1 5
];
x = X(:, 2); y = Y(:, 4);
# straight from the definition below
cov_xy = mean((x - mean(x)) .* (y - mean(y)))
# the covariance matrix
n = rows(x); # = rows(y) also
Xdev = X - ones(n, 1) * mean(X); # deviations from the mean
Ydev = Y - ones(n, 1) * mean(Y); # deviations from the mean
cov_matrix = Xdev' * (Ydev/n)
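A note on scaling: the matrix above divides by n, which matches the expectation-based definition. Sample-covariance routines commonly divide by n - 1 instead, so results can differ by a constant factor. The following is a minimal sketch of that comparison; the names cov_pop and cov_sample are illustrative and not part of any library.

```octave
# Same example data as in the script above.
X = [3 5 3 7 1 5; 2 1 4 2 2 1; 1 5 1 1 7 2];
Y = [1 1 4 2 6 1; 3 1 2 2 1 1; 2 1 3 1 1 5];
n = rows(X);                        # number of observations (rows)
Xdev = X - ones(n, 1) * mean(X);    # deviations from the column means
Ydev = Y - ones(n, 1) * mean(Y);
cov_pop    = Xdev' * Ydev / n;        # divide by n, as in the script
cov_sample = Xdev' * Ydev / (n - 1);  # divide by n - 1 (assumed alternative)
# The two differ only by the factor n / (n - 1).
scale_err = max(abs(cov_sample(:) - (n / (n - 1)) * cov_pop(:)))
```

When comparing against another implementation, a mismatch by exactly n / (n - 1) points to the normalization rather than to the definition of the entries.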
# References
# <http://planetmath.org/encyclopedia/Covariance.html>
# <http://planetmath.org/encyclopedia/CovarianceMatrix.html>
#
# cov(x, y) = E[(x - E(x))(y - E(y))].
# The (i, j) entry of cov(X, Y) should be cov(X(:, i), Y(:, j)).
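The entrywise claim can be checked numerically. The sketch below rebuilds the matrix from the deviations and compares every entry with the column-pair definition. It assumes division by n throughout, as in the definition above, and the name max_err is illustrative.

```octave
# Same example data as in the script above.
X = [3 5 3 7 1 5; 2 1 4 2 2 1; 1 5 1 1 7 2];
Y = [1 1 4 2 6 1; 3 1 2 2 1 1; 2 1 3 1 1 5];
n = rows(X);
Xdev = X - ones(n, 1) * mean(X);    # deviations from the column means
Ydev = Y - ones(n, 1) * mean(Y);
cov_matrix = Xdev' * (Ydev / n);    # columns(X)-by-columns(Y) matrix
# Compare each entry with the scalar covariance of the matching columns.
max_err = 0;
for i = 1:columns(X)
  for j = 1:columns(Y)
    x = X(:, i);
    y = Y(:, j);
    c = mean((x - mean(x)) .* (y - mean(y)));
    max_err = max(max_err, abs(cov_matrix(i, j) - c));
  endfor
endfor
max_err
```

If the two computations agree, max_err should be at the level of floating-point rounding.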
On 2/19/07, at 1:33 PM -0500, John W. Eaton wrote:
> The Matlab documentation says that if cov is given two matrix
> arguments with the same number of elements, it computes
>
> cov ([X(:), Y(:)])
>
> (not quite what you have above). This expression will always compute
> a 2x2 matrix. Octave's cov function does this if X and Y are
> matrices:
>
> If each row of X and Y is an observation and each column is a
> variable, the (I, J)-th entry of `cov (X, Y)' is the covariance
> between the I-th variable in X and the J-th variable in Y.
>
> and this will compute an NxN matrix with N == columns(X) == columns(Y).
>
> I don't know why we have the difference. Has Matlab always behaved
> this way? If so, then I'm surprised that you are the first to
> notice. Hmm. Should we change Octave to be compatible? What will
> this break? Should we try to preserve the old behavior? If so, how?
>
> Hmm. It seems corrcoef also has similar problems.
>
> Will someone who understands what should be happening here please fix
> cov and corrcoef and submit a patch?
>
> Thanks,
>
> jwe
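To make the size difference in the quoted message concrete, here is a minimal sketch that builds both results by hand rather than calling cov, since the behavior of cov is the point under discussion. It assumes division by the number of observations in both cases; the names C_flat and C_cols are illustrative.

```octave
X = [3 5 3 7 1 5; 2 1 4 2 2 1; 1 5 1 1 7 2];
Y = [1 1 4 2 6 1; 3 1 2 2 1 1; 2 1 3 1 1 5];

# Flattened interpretation: treat X(:) and Y(:) as two variables.
Z = [X(:), Y(:)];                   # numel(X)-by-2
m = rows(Z);
Zdev = Z - ones(m, 1) * mean(Z);
C_flat = Zdev' * Zdev / m;          # always 2-by-2

# Column-wise interpretation: each column is a variable.
n = rows(X);
Xdev = X - ones(n, 1) * mean(X);
Ydev = Y - ones(n, 1) * mean(Y);
C_cols = Xdev' * Ydev / n;          # columns(X)-by-columns(Y)

size(C_flat)
size(C_cols)
```

The first result has a fixed size regardless of the input shape, while the second grows with the number of columns, so the two interpretations cannot be reconciled by rescaling alone.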