
Re: Calculations while sorting?


From: John Darrington
Subject: Re: Calculations while sorting?
Date: Mon, 28 Jul 2008 17:26:04 +0800
User-agent: Mutt/1.5.13 (2006-08-11)

     On Sun, Jul 27, 2008 at 09:21:56PM -0700, Ben Pfaff wrote:
          
          It would be easy to create a "moments_reader" that does calculations
          like this as side effects and otherwise passes cases through
          without modification.
          
          There would be no need to make this part of sorting.  Just make a
          "moments_reader" out of your data and then pass the
          moments_reader to sort_execute.
     
Having thought about this, I have some more ideas:

1. Ben's suggestion is a good idea, but I'd like to generalize it.

  In my local dir, I have a number of new modules in src/math with a
similar interface to moments.c.  How about we abstract this interface
into a virtual base class (let's call it "struct statistics" for
now)?  Then, instead of a "moments_reader", we can have a polymorphic
"statistics_reader", which takes an array of heterogeneous
"struct statistics" objects.
     

So far, I envisage:

statistics (virtual class)
|
+- linear_statistics (virtual class)
|  |
|  +- moments
|  +- extremes
|
+- order_statistics (virtual class)
   |
   +- percentiles
   +- trimmed_mean
   +- tukey_hinges

It gets a bit more complicated than this, because order statistics can
only be calculated from a reader which has already been sorted,
whereas linear statistics can be calculated from any reader.
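
As a rough illustration of the "virtual base class" idea in C, the sketch
below embeds a struct of function pointers as the first member of each
concrete statistic.  None of this is PSPP code: the member layout, the
"accumulate" hook, and "statistics_feed" (a stand-in for what a
statistics_reader might do as cases pass through) are all hypothetical,
and plain doubles stand in for cases.  The sorted-input requirement of
order statistics is not modelled here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical base "class": every statistic supplies an accumulate hook. */
struct statistics
  {
    void (*accumulate) (struct statistics *, double value, double weight);
  };

/* A linear statistic: weighted count and sum (enough for the mean). */
struct moments
  {
    struct statistics parent;   /* Must be first, so the casts are valid. */
    double n;
    double sum;
  };

void
moments_accumulate (struct statistics *s, double value, double weight)
{
  struct moments *m = (struct moments *) s;
  m->n += weight;
  m->sum += value * weight;
}

/* Another linear statistic: smallest and largest value seen. */
struct extremes
  {
    struct statistics parent;
    int seen;
    double min;
    double max;
  };

void
extremes_accumulate (struct statistics *s, double value, double weight)
{
  struct extremes *e = (struct extremes *) s;
  (void) weight;                /* Weights do not affect min and max. */
  if (!e->seen || value < e->min)
    e->min = value;
  if (!e->seen || value > e->max)
    e->max = value;
  e->seen = 1;
}

/* Stand-in for the reader: hands each value, with weight 1, to every
   statistic in the heterogeneous array STATS. */
void
statistics_feed (struct statistics **stats, size_t n_stats,
                 const double *values, size_t n_values)
{
  for (size_t i = 0; i < n_values; i++)
    for (size_t j = 0; j < n_stats; j++)
      {
        assert (stats[j]->accumulate != NULL);
        stats[j]->accumulate (stats[j], values[i], 1.0);
      }
}
```

A caller would build an array such as { &m.parent, &e.parent } from a
"struct moments m" and a "struct extremes e", and pass it to
statistics_feed along with the data.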

2. In fact, the data from which order_statistics are calculated must not
 only be sorted, but must also be distinct.  So I'm considering
 adding a new translator which takes a sorted reader and returns a
 reader which delivers sorted and distinct data.  The complication
 here is that if the dictionary associated with the reader doesn't
 have a weight variable, then the translator must add one.
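
The core of such a translator can be sketched on plain arrays rather
than readers.  The names below ("struct weighted_value",
"collapse_distinct") are hypothetical, and treating a missing weight
variable as an implicit weight of 1 per case is an assumption, not
something settled in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* One distinct value together with the total weight of its duplicates. */
struct weighted_value
  {
    double value;
    double weight;
  };

/* Collapses runs of equal values in the sorted array IN[0..N) into OUT,
   summing their weights.  IN_WEIGHTS may be NULL, in which case every
   input value is assumed to carry weight 1.  OUT must have room for N
   elements.  Returns the number of distinct values written. */
size_t
collapse_distinct (const double *in, const double *in_weights, size_t n,
                   struct weighted_value *out)
{
  size_t n_out = 0;
  for (size_t i = 0; i < n; i++)
    {
      double w = in_weights != NULL ? in_weights[i] : 1.0;

      /* The input must already be sorted in ascending order. */
      assert (i == 0 || in[i - 1] <= in[i]);

      if (n_out > 0 && out[n_out - 1].value == in[i])
        out[n_out - 1].weight += w;     /* Duplicate: merge into last. */
      else
        {
          out[n_out].value = in[i];     /* New distinct value. */
          out[n_out].weight = w;
          n_out++;
        }
    }
  return n_out;
}
```

With input { 1, 1, 2, 5, 5, 5 } and no weights, this yields the three
pairs (1, 2), (2, 1) and (5, 3).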

3. Finally, I can envisage that with this arrangement, a convenience
 function which iterates over a reader to exhaustion and then destroys
 it, without explicitly doing anything with the data, will be useful.
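
A minimal sketch of such a drain function follows, using a toy reader
in place of the real one.  "struct toy_reader" and all of its functions
are hypothetical; the running sum stands in for whatever statistics the
reader computes as a side effect of cases passing through it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy reader over an array of doubles. */
struct toy_reader
  {
    const double *values;
    size_t n;
    size_t pos;
    double sum;         /* Side effect accumulated as values are read. */
    bool destroyed;
  };

/* Reads the next value into *VALUE.  Returns false when exhausted. */
bool
toy_reader_read (struct toy_reader *r, double *value)
{
  assert (!r->destroyed);
  if (r->pos >= r->n)
    return false;
  *value = r->values[r->pos++];
  r->sum += *value;
  return true;
}

/* A real reader would release its resources here. */
void
toy_reader_destroy (struct toy_reader *r)
{
  r->destroyed = true;
}

/* Pulls every remaining value through R, discarding each one, and then
   destroys R.  Only the side effects of reading are kept. */
void
toy_reader_drain (struct toy_reader *r)
{
  double discarded;
  while (toy_reader_read (r, &discarded))
    continue;
  toy_reader_destroy (r);
}
```

After toy_reader_drain returns, the caller inspects the accumulated
results (here, the "sum" member) and must not read from the reader
again.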


I shall be away from my email for a few days, but if anyone's got any
feedback on this, I'll discuss it when I get back.

J'


-- 
PGP Public key ID: 1024D/2DE827B3 
fingerprint = 8797 A26D 0854 2EAB 0285  A290 8A67 719C 2DE8 27B3
See http://pgp.mit.edu or any PGP keyserver for public key.

