like!
Good points. Question: Is it possible to create/save a session-based
journal?
frans
On 19/01/2017 21.40, Alan Mead wrote:
That's a good question. When I learned the syntax, it was the
only way to do it. There are some good resources available online
to address specific questions. I usually google something like
"spss syntax compute" and that produces a lot of hits for most
questions.
I should say that these days, I do use the GUI to generate the
syntax for virtually all analyses. What I do is to navigate to the
dialog for the analysis that I want, add some random variables
(for most analyses, you have to find numeric variables), select
the options I want and then paste the syntax. I then edit it to
include the variables I want. If the analysis is just a variable
or two and they are easily found, then it's just as easy to simply
pick that one. But if there are many variables and especially if
they are arranged in the dataset contiguously, then it's far
easier to edit the syntax to change the randomly-selected
variables to something like "X to Y" (which includes all columns
between variable X and variable Y in the dataset, including X and
Y).
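As a rough sketch of that edit (the variable names age, income, and q1 through q10 are hypothetical, and the pasted command will differ from dialog to dialog), the syntax before and after might look like this:

```spss
* As pasted from the dialog, with two placeholder variables.
FREQUENCIES /VARIABLES=age income.

* Edited to cover every column from q1 through q10 in dataset order.
FREQUENCIES /VARIABLES=q1 TO q10.
```

Note that TO follows the order of the columns in the dataset, not alphabetical order, so it only helps when the variables really are contiguous.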
I personally also use the GUI exclusively to generate syntax for
reading in raw data (reading SAV files with GET is pretty trivial).
I try hard to analyze tab-delimited files with the variable
names on top and I've found that those are usually read very well
into PSPP/SPSS. I paste the syntax generated by the GUI so I can
edit it if needed. For example, sometimes it will guess wrong
about a variable type. Or I might want to manually change a
variable name so it matches another file.
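For illustration, pasted syntax for a tab-delimited file with a header row might look roughly like the sketch below. The file name survey.tsv and all variable names and formats are assumptions made up for this example; the GUI will generate its own list.

```spss
* Read a tab-delimited file whose first line holds the variable names.
GET DATA /TYPE=TXT
  /FILE='survey.tsv'
  /ARRANGEMENT=DELIMITED
  /DELIMITERS="\t"
  /FIRSTCASE=2
  /VARIABLES=respondent F8.0
             region A12
             q1 F1.0
             q2 F1.0.
```

If the GUI had guessed a numeric format for region, changing it to a string format such as A12 in the /VARIABLES list is the kind of hand edit described above. Renaming a variable is likewise just a matter of editing its name in that list.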
So, I write syntax mainly for data manipulation (finding bad data,
scoring, creating new variables from input data, etc.) and that
includes a fairly small number of statements:
execute.
count
compute
recode
value labels
variable labels
temporary
select if
do if ... else ... end if.
sort cases
get
save
write
formats
match files / add files
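A short sketch of how a few of these statements fit together, assuming a hypothetical set of ten contiguous survey items named q1 through q10 (the new variable names are also made up for this example):

```spss
* Count unanswered items per case (q1 to q10 are hypothetical names).
COUNT n_missing = q1 TO q10 (MISSING).
VARIABLE LABELS n_missing 'Number of unanswered items'.

* Flag complete cases in a new variable rather than deleting anything.
COMPUTE complete = (n_missing = 0).
VALUE LABELS complete 0 'Incomplete' 1 'Complete'.

* TEMPORARY limits the SELECT IF to the next procedure only.
TEMPORARY.
SELECT IF (complete = 1).
DESCRIPTIVES /VARIABLES=q1 TO q10.
```

Because of TEMPORARY, the incomplete cases are still in the dataset after DESCRIPTIVES runs.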
Maybe I'm digressing. And maybe these comments are mainly for
people who will be doing long, complex, or very important
analyses. But I highly recommend using syntax (whether generated
by GUI or by hand) for all analyses. For one thing, it's
self-documenting (if you save it)... You can go back later and see
exactly what you did (e.g., what variables were included in
"INDEX"? How was that Likert scale scored? What regions went into
that market segment?) and if you find a problem, you already have
all the syntax you need to re-run the analysis. In fact, PSPP
produces readable output, but with SPSS, if you don't have a copy
of SPSS, you won't be able to read the output file or the SAV
file. So, the syntax is the only file that will be readable
(they're just text files; you can open them with your favorite
text editor or Notepad/Wordpad on Windows). If you did an SPSS
analysis and saved just the output and SAV data, you might not be
able to read either file years (or months) from now when you no
longer have SPSS.
I also think you should avoid ever modifying existing variables,
so that you can re-run your syntax to reproduce an analysis. (You
could also never over-write a SAV file, so that the modified
variables become part of a new SAV file, but this is fraught with
peril and tends to lead to a series of undocumented but
indistinguishable datasets, DATA1.SAV, DATA2.SAV, etc.... Far
better to document your analysis in syntax and avoid modifying
existing variables by creating new ones.)
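As a minimal sketch of that habit, assuming a hypothetical 5-point Likert item named q3, the reversed scores can go into a new variable with RECODE ... INTO:

```spss
* Reverse-score item q3 into a new variable; q3 itself is left untouched.
RECODE q3 (1=5) (2=4) (3=3) (4=2) (5=1) INTO q3_r.
VARIABLE LABELS q3_r 'q3, reverse scored'.
EXECUTE.
```

Since q3 is never overwritten, the syntax can be re-run from the top and gives the same result each time. If the RECODE fails, q3_r does not exist, and any later command that refers to it stops with an error.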
Sometimes, you can also re-use old syntax (if you analyze similar
datasets frequently).
Also, I recommend that when feasible (and sometimes it simply
isn't), you should avoid using SAV files. Or only use them as
temporary files, not as permanent storage of data. Instead, your
analysis should begin by reading a "raw" data source and then do
the whole analysis. The reason is that you cannot tell what data
transformations have been applied to a SAV dataset. Whereas if you
read the data from a raw source, you always know that the raw
source data is in its known original state. This might not be an
issue if your analyses do not require data transformations; but I
find that most of my analyses do require a lot. In those cases,
this isn't a trivial issue.
Once I had an NSF grant which entailed creating an ethics measure
and we used SPSS to score it. It would have been the work of
centuries to re-create the scoring (and verify it) through a
GUI for each dataset. Instead, I copied a fairly complex chunk of
syntax and adapted it to the names of the variables in the current
dataset. I had a syntax error in one statement because the number
of items had changed. Because it didn't execute, my data were
half scored (half unscored) and I compounded the problem by not
noticing the error and using the score in a later analysis. If
I'd written the syntax to create new scored variables, it wouldn't
be possible for my scored variables to be "half scored" ... some
of them wouldn't exist. And in that case, the missing variables
would have stopped the analysis, instead of allowing erroneous
results (from half-scored data) to be produced.
This is definitely a problem in complex analyses or when
manipulating data, but I'd argue it's a potential source of error
in any analysis that involves any degree of data transformation.
IIRC, PSPP distributes a small example dataset with some kind of
Likert data (customer satisfaction ratings?) and in some version
of that example dataset, one of the items had been reversed (i.e.,
the Likert responses had been swapped to 1->5, 2->4,
4->2, 5->1) and saved. You cannot tell this from the SAV
file (at all). In fact, I'm inferring it from a data analysis,
but it's the only possible way that one Likert item could be so
different. Garbage in, garbage out and you often cannot verify
that a SAV file is not "garbage" unless you've just created it.
You should also be generous in adding comments to your syntax. A
comment is a note to the reader/analyst about the syntax and looks
like this:
* data cleaning code .
* removed item 12 on 17-jan-2017 because it had a poor ITC .
* this is the composite that worked best out of the four we tried;
  its R2 was 0.56.
* scoring for the customer satisfaction Likert responses.
I will admit that syntax requires adhering to the rules of
PSPP/SPSS syntax. You leave off a period at the end or a quote
(or use the wrong quote) and PSPP/SPSS gives you a cryptic error
message. I think this is one of the reasons novice PSPP/SPSS
users avoid syntax, but I think they are handicapping themselves
as a result.
One final thing: one of the main advantages of PSPP is that it's
free (i.e., user-editable) software, which includes the manual.
So if you have modifications to make the manual clearer or to add
examples, I'm sure the developers will be delighted to see your
changes/additions.
-Alan
On 1/19/2017 1:11 PM, Aj Hollenbach wrote:
Thanks Alan. What is the best approach, in your
opinion, to learning the syntax for these types of
expressions? Again, I wholly relied upon the GUI in SPSS.
I did take a look at the PSPP manual, but did not
immediately see examples of the structure of the syntax.
Thanks,
Allen
_______________________________________________
Pspp-users mailing list
address@hidden
https://lists.gnu.org/mailman/listinfo/pspp-users
--
Alan D. Mead, Ph.D.
President, Talent Algorithms Inc.
science + technology = better workers
http://www.alanmead.org
I've... seen things you people wouldn't believe...
functions on fire in a copy of Orion.
I watched C-Sharp glitter in the dark near a programmable gate.
All those moments will be lost in time, like Ruby... on... Rails... Time for Pi.
--"The Register" user Alister, applying the famous
"Blade Runner" speech to software development