[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
type 3 patches
From: |
Jason Stover |
Subject: |
type 3 patches |
Date: |
Wed, 12 Jan 2011 21:57:20 -0500 |
User-agent: |
Mutt/1.5.18 (2008-05-17) |
Two patches for type 3 sums of squares are below. I'm not sure
which is best.
First, notice computation of type 3 sums of squares requires
computation of many least-squares surfaces (one for each explanatory
variable).
One patch changes glm.c only to allow the computation. The other adds a
function to linreg.c. The one for glm.c alone avoids teaching glm.c
about linreg.c, but doesn't provide a function that other procedures
can call. The patch to linreg.c allows procedures to just call
linreg_type3_ss, but requires those procedures to use linreg.c.
linreg.c seems kind of clunky now (I had such high hopes for it), so
if it seems best to have a function in src/math that can do the job
without appealing to linreg.c, then we would need to put that function
in covariance.c or make a new file. But putting it in covariance.c
would mean teaching covariance.c about sweep.c.
So I don't know. But I wonder what other procedures will need to know
about type 3 sums of squares. If they don't compute a linear model, I
don't think their computation of type 3 sums of squares will be
identical to that for the linear model anyway -- that is, they'll have
their own computations, meaning their own functions.
(BTW: I have not tested these beyond compilation.)
Patch below:
diff --git a/src/language/stats/glm.c b/src/language/stats/glm.c
index 862f49c..b074c56 100644
--- a/src/language/stats/glm.c
+++ b/src/language/stats/glm.c
@@ -191,8 +191,50 @@ cmd_glm (struct lexer *lexer, struct dataset *ds)
}
static void dump_matrix (const gsl_matrix *m);
+static void get_ssq (gsl_matrix * cov, gsl_vector *ssq);
static void
+get_ssq (gsl_matrix * cov, gsl_vector * ssq)
+{
+ size_t i;
+ size_t j;
+ size_t dropped;
+ size_t dep_column;
+ gsl_matrix * small_cov = NULL;
+
+ small_cov = gsl_matrix_alloc (cov->size1 - 1, cov->size2 - 1);
+ for (dropped = 1; dropped < cov->size1; dropped++)
+ {
+ for (i = 0; i < dropped; i++)
+ {
+ for (j = 0; j < dropped; j++)
+ {
+ gsl_matrix_set (small_cov, i, j, gsl_matrix_get (cov, i, j));
+ }
+ for (j = dropped + 1; j < small_cov->size2; j++)
+ {
+ gsl_matrix_set (small_cov, i, j - 1, gsl_matrix_get (cov, i, j));
+ }
+ }
+ for (i = dropped + 1; i < small_cov->size1; i++)
+ {
+ for (j = 0; j < dropped; j++)
+ {
+ gsl_matrix_set (small_cov, i - 1, j, gsl_matrix_get (cov, i, j));
+ }
+ for (j = dropped + 1; j < small_cov->size2; j++)
+ {
+ gsl_matrix_set (small_cov, i - 1, j - 1, gsl_matrix_get (cov, i,
j));
+ }
+ }
+ reg_sweep (small_cov, 0);
+ gsl_vector_set (result, dropped,
+ gsl_matrix_get (small_cov, 0, 0)
+ - gsl_matrix_get (small_cov, 0, 0));
+ }
+
+}
+static void
run_glm (const struct glm_spec *cmd, struct casereader *input, const struct
dataset *ds)
{
int v;
@@ -255,12 +297,18 @@ run_glm (const struct glm_spec *cmd, struct casereader
*input, const struct data
{
gsl_matrix *cm = covariance_calculate_unnormalized (cov);
+ gsl_vector *ssq;
+
+ ssq = gsl_vector_alloc (cov->size1 - 1);
+
dump_matrix (cm);
ws.total_ssq = gsl_matrix_get (cm, 0, 0);
reg_sweep (cm, 0);
+
+ get_ssq (cov, ssq);
dump_matrix (cm);
diff --git a/src/math/linreg.c b/src/math/linreg.c
index 63d1a0f..83e9fde 100644
--- a/src/math/linreg.c
+++ b/src/math/linreg.c
@@ -386,3 +386,50 @@ linreg_fit (const gsl_matrix *cov, linreg *l)
}
}
+/*
+ Call this function after calling linreg_fit.
+ */
+void
+linreg_type3_ss (const gsl_matrix * cov, linreg * l, gsl_vector * result)
+{
+ size_t i;
+ size_t j;
+ size_t dropped;
+ size_t dep_column;
+ gsl_matrix * small_cov = NULL;
+
+ small_cov = gsl_matrix_alloc (cov->size1 - 1, cov->size2 - 1);
+ for (dropped = 0; dropped < cov->size1; dropped++)
+ {
+ if (dropped != l->dependent_column)
+ {
+ for (i = 0; i < dropped; i++)
+ {
+ for (j = 0; j < dropped; j++)
+ {
+ gsl_matrix_set (small_cov, i, j, gsl_matrix_get (cov, i, j));
+ }
+ for (j = dropped + 1; j < small_cov->size2; j++)
+ {
+ gsl_matrix_set (small_cov, i, j - 1, gsl_matrix_get (cov, i,
j));
+ }
+ }
+ for (i = dropped + 1; i < small_cov->size1; i++)
+ {
+ for (j = 0; j < dropped; j++)
+ {
+ gsl_matrix_set (small_cov, i - 1, j, gsl_matrix_get (cov, i,
j));
+ }
+ for (j = dropped + 1; j < small_cov->size2; j++)
+ {
+ gsl_matrix_set (small_cov, i - 1, j - 1, gsl_matrix_get (cov,
i, j));
+ }
+ }
+ }
+ dep_column = (l->dependent_column < dropped) ? l->dependent_column :
(l->dependent_column - 1);
+ reg_sweep (small_cov, dep_column);
+ gsl_vector_set (result, dropped,
+ gsl_matrix_get (small_cov, small_cov->size1 - 1,
small_cov->size2 - 1)
+ - linreg_sse (l));
+ }
+}
- type 3 patches,
Jason Stover <=