octave-maintainers

Re: dataframe dereferencing


From: Jaroslav Hajek
Subject: Re: dataframe dereferencing
Date: Fri, 3 Sep 2010 21:53:25 +0200

On Fri, Sep 3, 2010 at 5:40 PM, Judd Storrs <address@hidden> wrote:
> On Fri, Sep 3, 2010 at 2:55 AM, Jaroslav Hajek <address@hidden> wrote:
>>
>> > I see your point--you think it's a performance issue, but I think it is
>> > incorrect to assume that subsetting a dataframe is necessarily
>> > inefficient. Really, that's a question of implementation not semantics.
>> > I don't think that linguistic novelty is a good approach to
>> > optimization. Two competing semantic models is a bad thing.
>>
>> Competing? Oh no, these would be just happily co-existing :) Besides,
>> for a dataframe df there are actually two cell conversions, df.cell
>> and df.as.cell, and you need to distinguish between them.
>
> Let's forget about "{}" indexing for now--I need to study the cs-list stuff
> more.  However, my condensed opinion is that the emulated postfix OOP is a
> terrible idea. Honestly, it accomplishes nothing and only adds complexity
> because the language is designed otherwise. What I'm saying is do something
> like ascell(df) or frame2cell(df) instead of df.as.cell() that follows
> patterns used elsewhere in the language and practice. I do not think
> anything is gained by pretending that octave follows postfix-style OO, it's
> just confusing and it's easy to fall into the mental trap of thinking
> that foo.changestate() actually does something to foo. It's more true to
> reality to require foo=changestate(foo).  Whether we like it or not, prefix
> notation has been chosen. The builtin mechanisms for operator overloading
> polymorphism rely on prefix notation. Injecting postfix notation seems
> pointless and my feeling is it will lead to unforeseen difficulties and
> suffering.
>

I think you didn't quite understand my view of this. I don't think of
df.as.cell(I,J) as a method, but rather along the lines that
df.as.cell is a virtual object that holds the dataframe data (without
headers) converted as a cell, which then can normally be indexed.

I basically agree that as_cell (df (I,J)) is more standard; I merely
pointed out that the former, df.as.cell(I,J), has a potential
performance advantage because it avoids the intermediate object. On
the other hand, the function approach has the advantage that you can
create a handle to it. Still, I think there is nothing wrong with
keeping df.as.cell as a performance shortcut (unless the gain turns
out to be negligible).
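
To make the trade-off concrete, here is a minimal sketch. It does not use the dataframe package: a plain struct stands in for a dataframe, and the names subframe, as_cell and as_cell_at are hypothetical helpers invented for illustration only.

```octave
% Hypothetical stand-in for a dataframe: a struct with headers and data.
df = struct ("names", {{"a", "b", "c"}}, "data", magic (3));
I = [1 3];
J = 2;

% Function style, as_cell (df (I,J)): subset first, which builds an
% intermediate frame (headers and all), then convert that frame.
subframe = @(df, I, J) struct ("names", {df.names(J)}, "data", df.data(I, J));
as_cell  = @(df) num2cell (df.data);
c1 = as_cell (subframe (df, I, J));

% Shortcut style, df.as.cell (I,J): index the raw data directly and
% convert only the selection; no intermediate frame is created.
as_cell_at = @(df, I, J) num2cell (df.data(I, J));
c2 = as_cell_at (df, I, J);

% Both routes give the same cell array.
same = isequal (c1, c2);

% The function style is a handle, so it can be passed to other functions.
frames = {df, df};
cells = cellfun (as_cell, frames, "UniformOutput", false);
```

Whether the saved intermediate object matters in practice depends on the real implementation; the sketch only shows where the extra object appears.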

>>
>> > Personally, I think octave's internal function dispatch is
>> > always going to be faster than a cobbled-together m-file-based dispatch.
>>
>> The dispatch is not the problem, the intermediate object is.
>
> Dispatch is the problem,

No.

> but I conflated things by also suggesting using
> "{}" indexing. It's mostly the emulated postfix notation that bothers me.
> "{}" indexing is a separate issue. Following standard octave OOP practice

Wow, a standard? :)

> is
> best because the language supports polymorphism based on prefix notation.

I don't think "polymorphism" actually makes sense in Octave. But yes,
prefix notation is more flexible.

> Creating derived classes with extended functionality will just be easier.

Not always. In the "postfix notation" all calls are handled by a
single function, which can sometimes be an advantage (e.g. if you want
to insert some sort of a hook function into each call).
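
A rough sketch of that hook idea, under stated assumptions: in a real class the single entry point would be the class's subsref method, but to keep the example self-contained a plain function named dispatch (hypothetical, as are log_hook and the "rows"/"cols" property names) plays that role on a struct.

```octave
1;  % not a function file: lets the function below be defined in a script

% Hypothetical single entry point; a class's subsref would play this role.
function out = dispatch (df, name, hook)
  hook (name);                 % runs before every call, whatever its name
  switch (name)
    case "rows"
      out = rows (df.data);
    case "cols"
      out = columns (df.data);
    otherwise
      error ("dispatch: unknown property '%s'", name);
  endswitch
endfunction

df = struct ("data", zeros (2, 5));
log_hook = @(name) printf ("accessing %s\n", name);

n = dispatch (df, "rows", log_hook);   % hook fires, then the row count is returned
m = dispatch (df, "cols", log_hook);   % same hook, different property
```

With prefix-style functions, the same hook would have to be added to each function separately.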

It's not my package, and in fact I haven't even looked at it much, so
my opinion is not really relevant. It's up to Pascal to decide.

-- 
RNDr. Jaroslav Hajek, PhD
computing expert & GNU Octave developer
Aeronautical Research and Test Institute (VZLU)
Prague, Czech Republic
url: www.highegg.matfyz.cz


