Re: RFC: method for storing data

From: fork
Subject: Re: RFC: method for storing data
Date: Fri, 25 Jun 2010 20:50:56 +0000 (UTC)
User-agent: Loom/3.14 (

CdeMills <Pascal.Dupuis <at>> writes:

> Hello,
> There has been some discussion recently about a wished-for feature: the
> ability to easily manipulate complex data sets. People have mentioned the
> data.frame feature from R, so let's define the problem.

Hear, hear!  Thanks for bringing this up in detail!  I think Judd has some
good ideas as well.

> To be discussed: may those changes be applied to the 'struct' object,
> or should we go for a specific dataframe object? In the latter
> case, I have a few ideas for implementing it as a new class from .m files.

'struct' needs to be kept compatible with Matlab, so we would need a new class
(why not call it 'dataframe'?).  A class also gives us the freedom to define all
sorts of useful metadata along with the basic table.
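
To make "table plus metadata" a bit more concrete, here is a rough sketch of
the shape I have in mind. It is written in Python purely for illustration; the
class name, the `units` and `description` fields, and the methods are all
hypothetical and not a proposal for the actual interface.

```python
class DataFrameSketch:
    """Named columns of equal length, plus free-form metadata (all hypothetical)."""

    def __init__(self, columns, units=None, description=""):
        # Every column must have the same number of rows.
        lengths = {len(values) for values in columns.values()}
        if len(lengths) > 1:
            raise ValueError("all columns must have the same length")
        self.columns = {name: list(values) for name, values in columns.items()}
        self._nrows = lengths.pop() if lengths else 0
        # Metadata travels with the table instead of living in separate variables.
        self.units = dict(units or {})
        self.description = description

    def nrows(self):
        return self._nrows

    def row(self, i):
        # One observation as a name -> value mapping.
        return {name: values[i] for name, values in self.columns.items()}
```

The point is only that a dedicated class can carry things like units or a
description alongside the columns, which a plain 'struct' cannot do without
breaking compatibility.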

> Let's open the discussion ...

R has the notion of ordered and unordered factors, which we should respect and
emulate (great for sorting, plus they matter in certain statistical algorithms).

We probably need a class for 1-d factors if we are going down that route.
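
As a rough illustration of what a 1-d factor class might hold, here is a
minimal sketch, again in Python only for the sake of a runnable example. The
class name `Factor` and its methods are hypothetical. Refusing to sort an
unordered factor is a choice made for this sketch, not a claim about how R
behaves.

```python
class Factor:
    """1-d categorical vector stored as integer codes into a list of levels."""

    def __init__(self, values, levels=None, ordered=False):
        # If no levels are given, fall back to the sorted unique values.
        self.levels = list(levels) if levels is not None else sorted(set(values))
        self.ordered = ordered
        index = {level: code for code, level in enumerate(self.levels)}
        # Each observation is stored as the position of its level.
        self.codes = [index[v] for v in values]

    def values(self):
        # Decode the integer codes back into level labels.
        return [self.levels[c] for c in self.codes]

    def sorted_values(self):
        # Sorting follows the declared level order, not alphabetical order.
        if not self.ordered:
            raise TypeError("this sketch only sorts ordered factors")
        return [self.levels[c] for c in sorted(self.codes)]

    def counts(self):
        # Tally observations per level, keeping levels that never occur.
        tally = {level: 0 for level in self.levels}
        for c in self.codes:
            tally[self.levels[c]] += 1
        return tally
```

For example, `Factor(["low", "high", "mid"], levels=["low", "mid", "high"],
ordered=True).sorted_values()` sorts by the declared level order rather than
alphabetically, which is the property that makes ordered factors handy.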

It would be great if there were a way to transparently reference "real" database
tables (SQLite especially), such that one could manipulate them from a SQL
prompt and have the changes automagically be ready in the dataframe (or actually
dataframe *handle* in this application).  A little object oriented magic, the
ability to pass a DSN to an instantiating function, and some connection metadata
in the class might work great here.
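
Here is a minimal sketch of the "handle" idea, using Python's standard
`sqlite3` module only because it makes for a short runnable example. The class
name `TableHandle` and its methods are hypothetical, and the "DSN" is assumed
to be just a path to an SQLite file. The handle keeps the connection metadata
and queries the table on every access, so changes made elsewhere show up
without any explicit reload.

```python
import sqlite3


class TableHandle:
    """Thin handle onto a live SQLite table; rows are fetched on demand."""

    def __init__(self, dsn, table):
        # Connection metadata lives in the object (dsn is assumed to be a file path).
        self.dsn = dsn
        self.table = table
        self.conn = sqlite3.connect(dsn)
        # Table names cannot be bound as SQL parameters, so check that the
        # table exists before interpolating its name into later queries.
        found = self.conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' AND name = ?",
            (table,),
        ).fetchall()
        if not found:
            raise ValueError("no such table: " + table)

    def columns(self):
        # PRAGMA table_info returns one row per column; index 1 is the name.
        info = self.conn.execute('PRAGMA table_info("%s")' % self.table).fetchall()
        return [row[1] for row in info]

    def rows(self):
        # Re-query each time so edits committed elsewhere are visible.
        return self.conn.execute('SELECT * FROM "%s"' % self.table).fetchall()
```

Nothing is cached in this sketch, which keeps it trivially consistent with the
database at the cost of a query per access. A real design would have to decide
how much to cache and when to invalidate it.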

R has different methods for plotting, summarizing, and the like based on the
class of the input.  I would NOT advocate going down that road, as it leads
to wackiness, complexity, weird results, and lots of hard-to-maintain type
coercion rules (object hierarchies are almost always bad, I think...).

Though... perhaps we could take the best R ideas and package them up in a 'forge

Can anyone comment on Matlab's dataset arrays and categorical arrays?  Does
anybody use them?  I *don't* think we should emulate TMW here, but I don't have
any experience with this toolbox.

> Regards
> Pascal

Thanks again!
