help-octave
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

## Principal Components Analysis

 From: John W. Eaton Subject: Principal Components Analysis Date: Wed, 27 Jan 1999 15:31:02 -0600 (CST)

```On 27-Jan-1999, Gordon Haverland <address@hidden> wrote:

|   Maybe someone here can help me.  One of my users wants to
| draw ellipses around the centroids of her clusters of data
| points from Principal Components Analysis.  I can force
| everything to work as I expect, but I don't understand
| some of the why of what I am doing.  I have been reading
| Numerical Recipes and Matrix Computation (3rd edition).
|
|   So, PCA does an SVD of the data to find eigenvalues
| and eigenvectors of the data set and we ignore
| eigenvalues (and associated vectors) with values
| less than 1.  And the combination of eigenvalue and
| eigenvector defines a hyper-ellipsoid.  The eigenvalues are
| equal to the square root of the variances in the rotated
| coordinate system.
|
|   The above is more or less definitions of PCA.  When I go to
| generate the ellipse(s), it turns out that I have to use the
| square root of the eigenvalues in order to get ellipses of
| the correct order of magnitude.  This I don't understand.
| Next, the ellipse for the vector x with 2-norm of 1 appears
| to contain far more than 68% of the data points.  This may be
| due to the few points lying outside the ellipse being quite far
| outside, but is still puzzling.  Last, if I want to plot
| ellipses of 75%, 90%, 95%, ..., what factors do I either
| multiply the eigenvalues (square roots of the eigenvalues)
| by (or the vector x)?
|
|   The end result of this should be a octave script or function
| which will take the data and do a 2D plot of the 2 most
| significant components, along with the ellipse that goes
| along with the data points.  I'll gladly donate said script
| to this archive.

Perhaps the following code will be of some help.  It will take a 2x2
matrix and generate a set of points for plotting an ellipse, its major
and minor axes, and the bounding box that just encloses the ellipse.

The 2x2 matrix is one that defines a quadratic form:

[ x1 x2 ] [ a11 a12 ] [ x1 ]
[ a21 a22 ] [ x2 ]

= a11*x1^2 + (a12+a21)*x1*x2 + a22*x2^2

The level is the value of the this expression along the contour.

jwe

## Plotting an ellipse with an aspect ratio not equal to one can be
## mighty confusing.

gset size ratio 1;

## [x, y, major, minor, bbox] = ellipse (amat, level, n)
##
## Given a 2x2 matrix, generate ellipse data for plotting.

function [x, y, major, minor, bbox] = ellipse (amat, level, n)

if (nargin < 3)
n = 100;
endif

if (nargin > 1)

[v, l] = eig (amat / level);

dl = diag(l);
if (any (imag (dl)) || any (dl <= 0))
error ("ellipse: amat must be positive definite");
endif

## Generate contour data.

a = 1 / sqrt (l(1,1));
b = 1 / sqrt (l(2,2));

t = linspace (0, 2*pi, n)';

xt = a * cos (t);
yt = b * sin (t);

## Rotate the contours.

ra = atan2 (v(2,1), v(1,1));

cos_ra = cos (ra);
sin_ra = sin (ra);

x = xt * cos_ra - yt * sin_ra;
y = xt * sin_ra + yt * cos_ra;

## Endpoints of the major and minor axes.

major = minor = (v * diag ([a, b]))';

major (2,:) = -major (1,:);
minor (1,:) = -minor (2,:);

## Bounding box for the ellipse using magic formula.

ainv = inv (amat);
xbox = sqrt (level * ainv(1,1));
ybox = sqrt (level * ainv(2,2));

bbox = [xbox ybox; xbox -ybox; -xbox -ybox; -xbox ybox; xbox ybox];

else
usage ("ellipse (amat, level, n)");
endif

endfunction

t = rand (100,2);

amat = t' * t;

level = 100;
npts = 181;

[x, y, major, minor, bbox] = ellipse (amat, level, npts);

gset nokey

plot (minor(:,1), minor(:,2), "address@hidden", major(:,1),
major(:,2),"address@hidden",
bbox(:,1), bbox(:,2), "-g", x, y, "-b");

```

reply via email to

 [Prev in Thread] Current Thread [Next in Thread]