Wed, 2 May 2007 10:23:10 -0400
On Wed, May 02, 2007 at 09:39:34AM +0800, John Darrington wrote:
> Having thought about this some more and tried a few experiments, I'm
> in favour of approach 2.
> We need not duplicate category.[ch]; rather, we can generalise it so
> that the new structure can do the work of both.
Okay. I had made some progress using approach 2, but have been sidetracked
by final exams. I'll get back to it next week.
> On Sun, Apr 15, 2007 at 03:06:17PM -0400, Jason Stover wrote:
> To have a glm procedure, pspp needs a data structure to handle
> interactions. An interaction can be thought of as another variable
> which is a function of two or more variables, usually categorical,
> like this:
>   Variable 1   Variable 2   Interaction
>   A            B            AB
>   E            B            EB
>   A            C            AC
>   E            C            EC
> ...etc. The interaction term could be created in one of two ways:
> Either 1) create a new variable in the dictionary that corresponds to
> the interaction, or 2) create a new 'interaction' data structure
> that contains all necessary mappings between existing variables and
> the value of the interaction.
> Approach 1 would add a variable to the dictionary, but would not
> create any more observations in the data set. It would make
> procedures that use interactions easier to code than approach 2
> would, because a procedure would need little special code to handle
> interactions. It would also avoid the need for more obscure
> string-values-to-binary-vector code like that in category.[ch].
> Approach 1 would still require some code to create the interaction,
> though it may not require a specialized "interaction" data structure
> available to all procedures.
> Approach 2 doesn't require adding anything to the dictionary, but it
> does mean that any procedures that need to use interactions would have
> to create those interactions themselves. These interactions would
> therefore be lost after the procedure exits, meaning that any other
> procedure that needs interactions would have to recreate
> them. Approach 2 also means writing more code that partly duplicates
> the code already in category.[ch].
> I favor approach number 1, but before I fiddle with the
> dictionary, I thought I should ask.
> PGP Public key ID: 1024D/2DE827B3
> fingerprint = 8797 A26D 0854 2EAB 0285 A290 8A67 719C 2DE8 27B3
> See http://pgp.mit.edu or any PGP keyserver for public key.
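
For concreteness, the 'interaction' data structure of approach 2 might
look roughly like the sketch below. This is only an illustration under
stated assumptions, not code from PSPP: struct factor, struct
interaction, MAX_LEVELS and every function name here are hypothetical,
and the plain 0/1 indicator encoding is a simplification that is not
meant to match what category.[ch] actually does. The structure records
the distinct values of two categorical variables and maps each pair of
values to one level of the interaction.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_LEVELS 16 /* hypothetical cap on distinct values per variable */

/* The distinct values ("levels") seen so far for one categorical variable.
   Strings are not copied, so the caller must keep them alive. */
struct factor {
    const char *levels[MAX_LEVELS];
    size_t n_levels;
};

/* Hypothetical interaction of two categorical variables. */
struct interaction {
    struct factor a;
    struct factor b;
};

/* Return the index of VALUE in F, or F->n_levels if it is absent. */
static size_t factor_find(const struct factor *f, const char *value)
{
    size_t i;
    for (i = 0; i < f->n_levels; i++)
        if (strcmp(f->levels[i], value) == 0)
            break;
    return i;
}

/* Record VALUE as a level of F if it has not been seen before. */
static void factor_add(struct factor *f, const char *value)
{
    if (factor_find(f, value) == f->n_levels) {
        assert(f->n_levels < MAX_LEVELS);
        f->levels[f->n_levels++] = value;
    }
}

/* First pass over the data: register one case's pair of values. */
static void interaction_add_case(struct interaction *ix,
                                 const char *va, const char *vb)
{
    factor_add(&ix->a, va);
    factor_add(&ix->b, vb);
}

/* Number of possible interaction levels (all pairs of values). */
static size_t interaction_n_levels(const struct interaction *ix)
{
    return ix->a.n_levels * ix->b.n_levels;
}

/* Second pass: map a pair of values to its interaction level, or return
   interaction_n_levels(IX) if either value was never registered. */
static size_t interaction_level(const struct interaction *ix,
                                const char *va, const char *vb)
{
    size_t ia = factor_find(&ix->a, va);
    size_t ib = factor_find(&ix->b, vb);
    if (ia == ix->a.n_levels || ib == ix->b.n_levels)
        return interaction_n_levels(ix);
    return ia * ix->b.n_levels + ib;
}

/* Write a 0/1 indicator vector of length interaction_n_levels(IX) to OUT.
   An unregistered pair yields all zeros. */
static void interaction_encode(const struct interaction *ix,
                               const char *va, const char *vb, double *out)
{
    size_t n = interaction_n_levels(ix);
    size_t level = interaction_level(ix, va, vb);
    size_t i;
    for (i = 0; i < n; i++)
        out[i] = (i == level) ? 1.0 : 0.0;
}
```

A procedure would call interaction_add_case() once per case in a first
pass (for the table above: A/B, E/B, A/C, E/C), then call
interaction_level() or interaction_encode() in a second pass. Two passes
are needed because the level index depends on the final number of values
of the second variable. Nothing is added to the dictionary, so, as noted
above, the structure is gone once the procedure exits.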