pspp-users

Re: next version of PSPP


From: p666
Subject: Re: next version of PSPP
Date: Fri, 13 Nov 2009 10:06:19 +0100

I think that PSPP's main use is processing large data files, where R cannot be used.

PSPP, like SPSS, is a competitor for SAS.

Programming capabilities are not at the same level across SAS, SPSS and R, and MATLAB and similar tools are on yet another level.

So what are the strengths of these systems built for big data files?

- the "group by" functionality is essential, both to split data or analyses for further processing with smaller, more flexible systems and to batch-process hundreds of local analyses,
- the reports (the cross-tabulation procedure),
- and a bunch of robust statistical procedures to run on mass data.
-...
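
To make the "group by" point concrete, here is a minimal sketch in Python (not PSPP syntax) of the split-and-summarise pattern: it reads rows one at a time and keeps only running totals per group, so memory use grows with the number of groups rather than the number of rows. The function name `grouped_means` and the `region`/`income` columns are made up for illustration.

```python
import csv
import io
from collections import defaultdict


def grouped_means(rows, group_key, value_key):
    """Return {group: mean of value_key}, reading rows in a single pass."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        group = row[group_key]
        totals[group] += float(row[value_key])
        counts[group] += 1
    return {group: totals[group] / counts[group] for group in totals}


# Hypothetical sample standing in for a large file on disk.
sample = io.StringIO("region,income\nnorth,10\nsouth,20\nnorth,30\nsouth,40\n")
means = grouped_means(csv.DictReader(sample), "region", "income")
print(means)
```

With a real file, the same function could be fed `csv.DictReader(open(path, newline=""))`, and the per-group results passed on to a smaller tool for further analysis.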

I think that, for instance, factor analysis, PCA, MCA, etc. are more useful to develop than very precise stepwise regression tools, because they are tools meant to be run on mass data.

Try a PCA in R with too many variables or observations: it hangs!

Try an MCA in R with a huge Burt matrix: you can't, it's too big to fit.
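
One common way to keep a PCA feasible when there are many observations is to accumulate the cross-product matrix in a single pass, so that memory depends on the number of variables and not on the number of rows. The sketch below is plain Python with hypothetical function names, not what PSPP or R actually does internally, and it does not help when the number of variables is itself huge.

```python
def streaming_covariance(rows):
    """Accumulate sums and cross-products row by row; return the sample covariance matrix.

    Note: this sum-of-products formula is simple but numerically naive;
    Welford-style updates are more stable for real data.
    """
    n = 0
    sums = None
    cross = None
    for row in rows:
        x = [float(v) for v in row]
        if sums is None:
            p = len(x)
            sums = [0.0] * p
            cross = [[0.0] * p for _ in range(p)]
        n += 1
        for i, xi in enumerate(x):
            sums[i] += xi
            for j, xj in enumerate(x):
                cross[i][j] += xi * xj
    if n < 2:
        raise ValueError("need at least two rows")
    p = len(sums)
    return [
        [(cross[i][j] - sums[i] * sums[j] / n) / (n - 1) for j in range(p)]
        for i in range(p)
    ]


def first_component(cov, iterations=200):
    """Power iteration: approximate the leading eigenvector of cov."""
    p = len(cov)
    v = [1.0] * p
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = sum(c * c for c in w) ** 0.5
        v = [c / norm for c in w]
    return v


# Tiny illustrative dataset: the second variable is exactly twice the first.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
cov = streaming_covariance(iter(data))
pc1 = first_component(cov)
print(cov, pc1)
```

The rows are consumed through an iterator, so they could just as well come from a file reader; only the p x p matrix is ever held in memory.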

So I think what should primarily guide development is this: which statistical and data analyses are useful on big data files?
There should be plenty of them in these days of data mining.

PSPP can be the complement to R, or R the complement to PSPP: data mining on one side, and precise analysis of small datasets on the other.


best regards

PB




2009/11/13 William Simpson <address@hidden>
My two cents

On Thu, Nov 12, 2009 at 10:56 PM, Gene Shackman <address@hidden> wrote:
>
> Sounds like a lot of great work being done on PSPP.  I also add my thanks to those developing the package.
>
> A couple of basic things were these that John mentioned
> * An improved output system.
> * Cut/Paste/Export to/from OpenOffice.org and Koffice.
These are low priorities for me. Basic statistical functionality is at
the top of my list.

>
> and the Anova William mentioned.  Also the regression currently available seems to be forced choice, that is, all factors get put into the equation.  It would be great if there were some selection procedure like forward or backward regression.

This is an advanced procedure that doesn't get treated until grad
school. Therefore it is very low on my list of priorities.

In my opinion, PSPP at this point should be aiming at people with very
basic knowledge and needs. It would be pointless trying to compete
with packages like R (which is what statisticians [and I] use). As
PSPP builds up from the bottom, it can add more and more capabilities.


> Don't forget that you're always welcome to download the latest development
> version - just bear in mind it hasn't been thoroughly tested.  If you just want
> to know the major changes between the released version and the development version,
> you can take a look at the NEWS file.  See: http://git.savannah.gnu.org/cgit/pspp.git/tree/NEWS
I have no experience building from source under Windows. I am familiar
and very capable with Linux. But my use for PSPP is under Windows, for
instruction of people with very basic needs.


Again, my two cents
> Additional features which *may* be in the next release include:
>
> * Full UTF8 support.
not even on my list, let alone near the bottom

> * An improved output system.
very low on my list

> * Cut/Paste/Export to/from OpenOffice.org and Koffice.
very low

> * The GRAPH command.
better plots would be nice

> * The FACTOR command.
I guess this means factor analysis. If so, again this is an advanced
technique taught in grad school. I think you should start from ground
up.

> * The GLM command.
If you mean general linear model, bravo! It is by far the most
general, powerful, and widely used statistical approach. A huge amount
of stuff falls under its umbrella. It is fundamental and therefore is
taught from early stages all the way through grad school -- linear
regression and anova.
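
As a small illustration of regression and anova sharing one framework: an ordinary least-squares fit of a response on a 0/1 group indicator gives an intercept equal to the mean of group 0 and a slope equal to the difference between the two group means. The sketch below is plain Python with made-up data, and `ols_simple` is a hypothetical helper, not a PSPP command.

```python
def ols_simple(x, y):
    """Least-squares fit of y = intercept + slope * x for one predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up scores for two groups, coded 0 and 1.
group = [0, 0, 0, 1, 1, 1]
score = [4.0, 5.0, 6.0, 7.0, 9.0, 11.0]

intercept, slope = ols_simple(group, score)
mean0 = sum(s for g, s in zip(group, score) if g == 0) / group.count(0)
mean1 = sum(s for g, s in zip(group, score) if g == 1) / group.count(1)
print(intercept, slope, mean0, mean1 - mean0)
```

With more than two groups the same idea carries over by using one indicator column per group beyond the first, which needs a multiple-regression solver instead of this one-predictor helper.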

If you mean generalised linear models, again this is an advanced
technique. If for example you have Bernoulli trials and want a
logistic link, this could be done pretty decently using linear models.
In fact this is how it was done until fairly recently (generalised
linear models were only introduced in the 1970s).

Thanks again for PSPP!!

Bill


_______________________________________________
Pspp-users mailing list
address@hidden
http://lists.gnu.org/mailman/listinfo/pspp-users



