From: John Darrington
Subject: Re: your mail
Date: Mon, 25 Jul 2016 07:06:43 +0200
User-agent: Mutt/1.5.23 (2014-03-12)
On Mon, Jul 25, 2016 at 01:59:27AM +0000, Jeanette Boyne wrote:
> GNU pspp 0.10.2-g654fff
> running on Windows 10 Home
> 2.49 GHz processor
> 16 GB RAM
> 141 GB available disk space
>
> I'm loading a data set with about 5 million records and a few hundred
> columns. While loading, my system never uses more than 20% of available RAM
> or CPU and never more than 2% of disk. (I've done it several times because
> I keep running into minor errors in my script.) Is there a recommended way
> to use more of my capacity to speed up the process? My paging file is at
> the recommended/auto setting.
Optimisation issues like this are hard to answer, because so much depends
on the individual circumstances. First some questions:
* How long does it take you to load such a dataset? (how many minutes?)
* From whence comes that dataset? From a .sav file or something else?
* Are your variables string or numeric?
Right now I can make these suggestions (but can't promise any will work):
* Use a unix based system instead of windows. (pspp is designed for GNU/Linux,
  so it is always going to work better there.) Serious use is never going to
  be optimal on windows.
* Try increasing the "WORKSPACE" setting (see Section 16.2 of the manual).
  This should allow pspp to use more of your 16 GB of RAM.
* If you are using psppire (the GUI), then don't. Instead start pspp from a
terminal and run it in the conventional way. (A lot of processing is wasted
generating pretty screen graphics).
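To sketch the second and third suggestions together: you could put a SET
WORKSPACE command at the top of your syntax file and run it in batch mode.
The file and data names below are hypothetical, and the exact value is an
assumption (the manual documents WORKSPACE in kilobytes, so a larger figure
like the one here would request several gigabytes):

```
/* speedup.sps -- hypothetical file name; adjust to your own script. */
SET WORKSPACE=4194304.      /* value is in kilobytes; see Section 16.2. */
GET FILE='mydata.sav'.      /* hypothetical data file name. */
```

Then invoke it from a terminal as `pspp speedup.sps` instead of opening it
through the psppire GUI.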
I'd be interested to see how these work for you.
J'