Re: dataframe dereferencing

From: Judd Storrs
Subject: Re: dataframe dereferencing
Date: Thu, 2 Sep 2010 15:31:25 -0400

On Thu, Sep 2, 2010 at 3:04 PM, Jaroslav Hajek <address@hidden> wrote:

> while you have every right to naively expect this, understand that for
> cell(x(1:3, 1:2)) the inner expression must result in some kind of
> intermediary object (e.g. a sub-dataframe) which is then converted to
> cell, while x.cell(1:3, 1:2) may be optimized so as to extract the
> proper portion of data to cell directly. Similarly for matrix.

I see your point: you think it's a performance issue. But I think it is incorrect to assume that subsetting a dataframe is necessarily inefficient. Really, that's a question of implementation, not semantics. I don't think that linguistic novelty is a good approach to optimization, and having two competing semantic models is a bad thing. If performance is a problem, optimize later. Personally, I think Octave's internal function dispatch is always going to be faster than a cobbled-together m-file-based dispatch. A different optimization would be to make dataframe perform lazy sub-referencing, e.g. a subframe is a view of the original frame (which could also have memory advantages).
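To make the view idea concrete, here is a minimal sketch in plain Octave. It is not how the dataframe package works: the helper names (make_view, sub_view, to_matrix, to_cell) are made up for illustration, and an ordinary matrix stands in for the frame's storage.

```octave
% Hypothetical sketch: a plain matrix stands in for the dataframe's storage.
data = magic(6);

% A "view" only records which rows and columns it refers to; no data is copied.
make_view = @(rows, cols) struct("rows", rows, "cols", cols);

% Sub-referencing a view just composes the index vectors.
sub_view = @(v, r, c) struct("rows", v.rows(r), "cols", v.cols(c));

% Data is extracted once, when a concrete type is finally requested.
to_matrix = @(d, v) d(v.rows, v.cols);
to_cell   = @(d, v) num2cell(d(v.rows, v.cols));

v = make_view(2:5, 1:4);
w = sub_view(v, 1:3, 1:2);   % refers to rows 2:4, columns 1:2 of data

m = to_matrix(data, w);      % 3x2 numeric matrix
c = to_cell(data, w);        % 3x2 cell array
```

Under this scheme something like cell(x(1:3, 1:2)) would not need to build a full intermediate sub-dataframe; the inner indexing would only produce a small index record, and the copy would happen in the conversion.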

> However, I see no reason why dataframe couldn't support conversion to
> cell through cell (dataframe) as well.

Well, I don't think we want to go the Perl route if we can avoid it...

> Before you overload {} or suggest doing it, make sure you understand
> the associated cs-list & numel issues.

You're going to have to point me somewhere on this one. I'm proposing it anyway because it's semantically correct.

--judd