pspp-dev

## GLM vs unbalanced designs

 From: John Darrington Subject: GLM vs unbalanced designs Date: Tue, 27 Sep 2011 14:02:02 +0000 User-agent: Mutt/1.5.18 (2008-05-17)

```I thought we had a working GLM, at least for factorial ANOVA. However, on doing
further testing, it appears that whilst it works properly for balanced designs
(i.e., those with equal sample sizes), for designs with unequal sample sizes the
answers are quite different from those produced by other software.

See below for an example.  Even the intercept is way off, which surprises me
because the intercept shouldn't be aware of any groupings.

I've been scouring the literature and a number of textbooks to try to find out
whether there should be a correction for unequal sample sizes.  A number of
sources say that there should be such a 'correction', but on examination they
talk only about weighted means of the groups, which are relevant only if the
total mean has been calculated from the group means.  If the total mean is
computed from the individual values (as we do), there is no distinction.

Does anyone have any ideas about what we need to do differently in the face of
unequal sample sizes?  I've tried the obvious things, like using harmonic
means instead of arithmetic ones, but so far no luck.  And quite why the
intercept should be different, I don't understand.

J'

data list notable
fixed
/dmethod 1 illum 3 score 5-6.
begin data.
1 1  3
1 1  4
1 1  6
1 1  7
1 2  5
1 2  6
1 2  6
1 2  7
1 2  7
1 3  4
1 3  6
1 3  8
1 3  8
1 4  8
1 4 10
1 4 10
1 4  7
1 4 11
2 1  2
2 1  3
2 1  4
2 2  3
2 2  5
2 2  6
2 2  3
2 3  9
2 3 12
2 3 12
2 3  8
2 4  9
2 4  7
2 4 12
2 4 11
end data.

variable labels score 'Accuracy Score'.

glm score by illum dmethod
/method=sstype(3)
/intercept=include
/criteria=alpha(.05)
/design.

Actual Results:

Tests of Between-Subjects Effects
#===============#=======================#==#===========#=======#====#
#     Source    #Type III Sum of Squares|df|Mean Square|   F   |Sig.#
#===============#=======================#==#===========#=======#====#
#Corrected Model#                184.250| 7|     26.321|  8.061|.000#
#Intercept      #               1589.121| 1|   1589.121|486.690|.000#
#illum          #                150.592| 3|     50.197| 15.374|.000#
#dmethod        #                   .113| 1|       .113|   .035|.854#
#illum * dmethod#                 33.212| 3|     11.071|  3.391|.034#
#Error          #                 81.629|25|      3.265|       |    #
#Total          #               1855.000|33|           |       |    #
#Corrected Total#                265.879|32|           |       |    #
#===============#=======================#==#===========#=======#====#

Expected Results:

Tests of Between-Subjects Effects

Dependent Variable: Accuracy Score

Source            Type III Sum of Squares  df  Mean Square  F        Sig.
Corrected Model   195.029(b)               7   27.861       9.831    .000
Intercept         1478.432                 1   1478.432     521.677  .000
ILLUM             158.951                  3   52.984       18.696   .0001
DMETHOD           6.176E-02                1   6.176E-02    .022     .8842
ILLUM * DMETHOD   43.991                   3   14.664       5.174    .0063
Error             70.850                   25  2.834
Total             1855.000                 33
Corrected Total   265.879                  32
--
PGP Public key ID: 1024D/2DE827B3
fingerprint = 8797 A26D 0854 2EAB 0285  A290 8A67 719C 2DE8 27B3
See http://pgp.mit.edu or any PGP keyserver for public key.

```
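The message contrasts a total mean computed from individual values with one built from group means. A small sketch can make that distinction concrete on the posted data. The Python below is not taken from PSPP or from any other package. The function name `intercept_sums_of_squares` and the `ROWS` list are made up for illustration, and the "unweighted mean of cell means" formula is an assumption about what a Type III intercept test might be measuring in the reference software, not a confirmed description of it.

```python
from collections import defaultdict

# (dmethod, illum, score) rows copied from the DATA LIST block in the message.
ROWS = [
    (1, 1, 3), (1, 1, 4), (1, 1, 6), (1, 1, 7),
    (1, 2, 5), (1, 2, 6), (1, 2, 6), (1, 2, 7), (1, 2, 7),
    (1, 3, 4), (1, 3, 6), (1, 3, 8), (1, 3, 8),
    (1, 4, 8), (1, 4, 10), (1, 4, 10), (1, 4, 7), (1, 4, 11),
    (2, 1, 2), (2, 1, 3), (2, 1, 4),
    (2, 2, 3), (2, 2, 5), (2, 2, 6), (2, 2, 3),
    (2, 3, 9), (2, 3, 12), (2, 3, 12), (2, 3, 8),
    (2, 4, 9), (2, 4, 7), (2, 4, 12), (2, 4, 11),
]


def intercept_sums_of_squares(rows):
    """Return (ss_grand, ss_unweighted, ss_error) for a two-way layout.

    ss_grand      : N * (mean of all observations)^2
    ss_unweighted : SS for the hypothesis that the unweighted mean of the
                    cell means is zero (assumed Type III interpretation)
    ss_error      : pooled within-cell sum of squares
    """
    cells = defaultdict(list)
    for dmethod, illum, score in rows:
        cells[(dmethod, illum)].append(score)

    scores = [score for _, _, score in rows]
    ss_grand = sum(scores) ** 2 / len(scores)

    cell_means = {key: sum(v) / len(v) for key, v in cells.items()}
    k = len(cells)
    unweighted_mean = sum(cell_means.values()) / k
    # Var(unweighted mean) = sigma^2 * sum(1/n_cell) / k^2, so the
    # one-degree-of-freedom SS is mean^2 * k^2 / sum(1/n_cell).
    inv_n = sum(1 / len(v) for v in cells.values())
    ss_unweighted = unweighted_mean ** 2 * k ** 2 / inv_n

    ss_error = sum(
        (x - cell_means[key]) ** 2 for key, v in cells.items() for x in v
    )
    return ss_grand, ss_unweighted, ss_error


if __name__ == "__main__":
    grand, unweighted, error = intercept_sums_of_squares(ROWS)
    print(f"grand-mean SS:      {grand:.3f}")
    print(f"unweighted-mean SS: {unweighted:.3f}")
    print(f"within-cell SS:     {error:.3f}")
```

On the posted data the grand-mean figure comes out at 1589.121, which is the Intercept row under "Actual Results", while the unweighted-mean figure (1478.432) and the within-cell figure (70.850) agree with the Intercept and Error rows under "Expected Results". That is only suggestive: it hints that the reference output may be testing the unweighted mean of the cell means rather than the mean of the individual values, which would differ only when the cell sizes are unequal.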
