Re: dummy coding of categorical variables ( Pspp-users Digest, Vol 209,


From: tim.goodspeed
Subject: Re: dummy coding of categorical variables ( Pspp-users Digest, Vol 209, Issue 11 )
Date: Wed, 20 Dec 2023 11:32:53 -0000

Thanks Alan,

The coding is as you describe.  Three variables are currently coded in this 
way. One of them, employment, can be FT/PT/None.  In the dataset FT = 1, 
PT = 2, None = 3.
Therefore,
- FT becomes a new variable = 1 if employment = 1
- PT becomes a new variable = 1 if employment = 2
- employment = 3 is not included.  'None' is the reference level.
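
As a rough illustration of this coding outside PSPP, here is a minimal Python
sketch. The data values and the helper name dummy() are made up for the
example; only the 1/2/3 coding comes from the dataset.

```python
# Hypothetical employment codes: 1 = FT, 2 = PT, 3 = None (reference level).
employment = [1, 2, 3, 1, 3, 2, 2]

def dummy(values, level):
    """Return 1 where the value equals `level`, else 0."""
    return [1 if v == level else 0 for v in values]

ft = dummy(employment, 1)  # 1 if employment = 1 (FT)
pt = dummy(employment, 2)  # 1 if employment = 2 (PT)

# Rows where both dummies are 0 belong to the reference level (None = 3).
reference_rows = [i for i, (f, p) in enumerate(zip(ft, pt)) if f == 0 and p == 0]

print(ft)
print(pt)
print(reference_rows)
```

No dummy is created for employment = 3; those rows are identified by both
dummies being 0.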

In the example regression output table I tried to include in the message, these 
are RA1SG17A_1 (for FT) and RA1SG17A_2 (for PT), and RA1SG17A_1 is one of the 
variables producing NaN.  (What's the best way to include the regression output 
table in a pspp-users@gnu.org message?)


Tim Goodspeed
+44 (0)7714 136 176    |    @TimGoodspeed

-----Original Message-----
From: pspp-users-bounces+tim.goodspeed=btinternet.com@gnu.org 
<pspp-users-bounces+tim.goodspeed=btinternet.com@gnu.org> On Behalf Of 
pspp-users-request@gnu.org
Sent: Wednesday, December 20, 2023 10:17 AM
To: pspp-users@gnu.org
Subject: Pspp-users Digest, Vol 209, Issue 11


Today's Topics:

   1. Re: dummy coding of categorical variables results in zero
      coefficients and standard errors (Alan Mead)


----------------------------------------------------------------------

Message: 1
Date: Wed, 20 Dec 2023 04:16:44 -0600
From: Alan Mead <amead2@alanmead.org>
To: pspp-users@gnu.org
Subject: Re: dummy coding of categorical variables results in zero
        coefficients and standard errors
Message-ID: <35bbe032-0a39-413a-b632-88cdaa727245@alanmead.org>
Content-Type: text/plain; charset="utf-8"; Format="flowed"

Tim,

NaN looks like a numerical error. I'm curious, how many levels does the variable 
have and how many dummy variables are you using?

If the original variable has K levels, you should have K-1 dummy variables. For 
example, if your variable were location (1=rural, 2=suburban, 3=urban) then you 
would pick one level to be the reference and create two dummy variables, 
perhaps:

recode location (1=1) (else=0) into dum1.

recode location (2=1) (else=0) into dum2.

Then the coefficients of dum1 and dum2 tell you how living in a rural
(dum1) or suburban (dum2) area compares to living in an urban area.

The model won't be defined if you use K variables for K levels.
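
To see why, here is a small Python sketch (not PSPP; the location values and
the helper names rank() and design() are hypothetical). With a constant plus
K dummies, the dummies sum to the constant column, so the design matrix has
fewer independent columns than coefficients to estimate. With K-1 dummies it
does not.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows) via Gaussian elimination on exact fractions."""
    m = [[Fraction(x) for x in r] for r in rows]
    rnk = 0
    ncols = len(m[0]) if m else 0
    for col in range(ncols):
        # Find a row at or below the current pivot position with a nonzero entry.
        pivot = next((r for r in range(rnk, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rnk], m[pivot] = m[pivot], m[rnk]
        for r in range(len(m)):
            if r != rnk and m[r][col] != 0:
                factor = m[r][col] / m[rnk][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rnk])]
        rnk += 1
    return rnk

# Hypothetical data: 1 = rural, 2 = suburban, 3 = urban.
location = [1, 2, 3, 1, 2, 3, 3]

def design(levels):
    """Constant column followed by one dummy column per listed level."""
    return [[1] + [1 if v == k else 0 for k in levels] for v in location]

full = design([1, 2, 3])   # constant + K dummies: 4 columns
reduced = design([1, 2])   # constant + K-1 dummies: 3 columns

rank_full = rank(full)
rank_reduced = rank(reduced)
print(rank_full, len(full[0]))
print(rank_reduced, len(reduced[0]))
```

When the rank is smaller than the number of columns, the coefficients are not
uniquely determined.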

I notice that both of the zeros are for xxx_1 variables, which suggests the 
categorical variable may not be coded correctly. But I don't know if 
that's what you are seeing. You could also get zeros if there were no instances 
of that dummy code, but you shouldn't see NaN values. It could also be another 
problem, or a bug. In fact, I think it's probably a bug to see NaNs...
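
One way to check the "no instances" case is to count how many rows each dummy
flags. This is only a Python sketch with made-up columns; the names ft, pt
and constant_columns() are hypothetical, and it says nothing about how PSPP
itself handles such a variable.

```python
from collections import Counter

# Hypothetical dummy columns; the names and values are illustrative only.
dummies = {
    "ft": [0, 0, 0, 0, 0],  # no row has this level
    "pt": [1, 0, 1, 0, 0],
}

def constant_columns(columns):
    """Return names of columns that take a single value in every row."""
    return [name for name, values in columns.items() if len(set(values)) <= 1]

# Number of rows coded 1 for each dummy.
counts = {name: Counter(values)[1] for name, values in dummies.items()}
flagged = constant_columns(dummies)

print(counts)
print(flagged)
```

A dummy that never equals 1 is constant across rows, so it carries no
information the regression could use.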

-Alan


On 12/20/23 3:46 AM, tim.goodspeed@btinternet.com wrote:
>
> A basic stats question and a specific PSPP query, please.  Any help 
> gratefully received.  I can’t see this in the archives anywhere 
> (searching for ‘categorical’ and ‘dummy’).
>
> For a linear regression, some variables are categorical and so 
> included using dummy coding (Coding Systems for Categorical Variables 
> in Regression Analysis (ucla.edu) 
> <https://stats.oarc.ucla.edu/spss/faq/coding-systems-for-categorical-variables-in-regression-analysis-2/#:~:text=Categorical%20variables%20require%20special%20attention,entered%20into%20the%20regression%20model.>).
>
> *basic stats question:* This results in a zero coefficient and zero 
> standard error for some variables, as shown in the example below.  Is 
> this correct? Does it mean there is little or no linear relationship to 
> be found?
>
> *specific PSPP query:* if there is little relationship and the coefficient 
> is very small, is there a way to tell PSPP to show the very small 
> value instead of zero?
>
> Thanks in advance
>
> Table: Model Summary (adjRA1SR1)
>
>        R  R Square  Adjusted R Square  Std. Error of the Estimate
>  0.55723  0.310505           0.302797                      0.8359
>
> Table: ANOVA (adjRA1SR1)
>
>             Sum of Squares    df  Mean Square          F  Sig.
> Regression       619.25791    22    28.148087  40.284698     0
> Residual         1375.0987  1968     0.698729
> Total            1994.3566  1990
>
> Table: Coefficients (adjRA1SR1)
> (B, Std. Error: Unstandardized Coefficients; Beta: Standardized
> Coefficients; Lower/Upper Bound: 95% Confidence Interval for B)
>
>                       B  Std. Error       Beta           t   Sig.  Lower Bound  Upper Bound
> (Constant)     8.163407    0.310014          0   26.332394      0     7.555417     8.771397
> lnSTINC       -0.036745    0.011677  -0.088107   -3.146888  0.002    -0.059645    -0.013845
> RA1PKHSIZ     -0.011834    0.016218  -0.020561   -0.729708  0.466    -0.043639     0.019971
> RA1PRAGE      -0.039326    0.011175  -0.550388   -3.519082      0    -0.061242     -0.01741
> sqPRAGE        0.000464    0.000109   0.666977    4.258349      0      0.00025     0.000678
> RA1PRSEX        0.13709     0.03935   0.068446    3.483888  0.001     0.059918     0.214261
> RA1PB19_1             0           0          0         NaN    NaN            0            0
> RA1PB19_2     -0.485628    0.170694  -0.054029   -2.845015  0.004    -0.820389    -0.150867
> RA1PB19_3     -0.324574    0.058981  -0.109094   -5.503011      0    -0.440246    -0.208902
> RA1PB19_4     -0.333625    0.089807  -0.074169   -3.714896      0    -0.509752    -0.157497
> RA1PB1        -0.002888    0.008407  -0.007002   -0.343559  0.731    -0.019376       0.0136
> RA1SG17A_1            0           0          0         NaN    NaN            0            0
> RA1SG17A_2    -0.061221    0.053837  -0.021822   -1.137147  0.256    -0.166804     0.044363
> RA1PA1         -0.15082    0.022182  -0.160102     -6.7991      0    -0.194324    -0.107317
> RA1PA2        -0.248882    0.024367  -0.243609  -10.214077      0     -0.29667    -0.201095
> RA1SC1        -0.328042    0.073134   -0.08782   -4.485512      0    -0.471469    -0.184614
> RA1PF3bin      0.003064    0.041159   0.001422    0.074435  0.941    -0.077655     0.083783
> RA1PF7A_2      0.009538    0.086914   0.002111    0.109735  0.913    -0.160917     0.179992
> RA1PF7A_3       0.14177    0.166844   0.016081    0.849712  0.396     -0.18544     0.468979
> RA1PF7A_4     -0.104009    0.155971   -0.01266   -0.666848  0.505    -0.409894     0.201877
> RA1PF7A_5      0.173309     0.59246   0.005486    0.292525   0.77    -0.988606     1.335224
> RA1PF7A_6      0.064264    0.080864    0.01504    0.794712  0.427    -0.094325     0.222853
> RA1PG2        -0.350528    0.030049  -0.233421   -11.66509      0     -0.40946    -0.291597
>
-- 

Alan D. Mead, Ph.D.
President, Talent Algorithms Inc.

science + technology = better workers

https://talalg.com


Linus' Law: Given enough eyeballs, all bugs are shallow.
