help-octave
Re: octave and postgres examples


From: Olaf Till
Subject: Re: octave and postgres examples
Date: Fri, 21 Jun 2013 10:55:10 +0200
User-agent: Mutt/1.5.20 (2009-06-14)

On Thu, Jun 20, 2013 at 03:02:53PM +0100, richard wrote:
> <snip>
> I've a couple of other questions related to this: 
> Any thoughts on the speed of these operations?

Transmission speed (Octave <-> Postgresql) is best with binary
transmission (as is used if you use placeholders or the copy command)
as opposed to sending data as strings.
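
To make the placeholder variant concrete, here is a minimal sketch of how
command and parameters would be prepared. The table `t' and the values are
made up, and passing the parameters as a cell array with one cell per
placeholder is my reading of 'help pq_exec_params', so please check it
against your installed version. The call itself is left as a comment since
it needs an open connection.

```octave
% Hypothetical table and values, for illustration only.
cmd = "insert into t (a, b, c) values ($1, $2, $3);";
row = [1.5, 2.5, 3.5];
params = num2cell (row);      % one cell per placeholder $1, $2, $3

% With an open connection `conn' (from pq_connect) the call would be:
% pq_exec_params (conn, cmd, params);
```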

Conversion of Octave data into Postgresql's binary representation
before transmission (done by the package) has been implemented with
speed in mind. I'd guess that it is at least not slower than the
fastest way to convert the data to strings (which would probably be
sprintf with a vector argument).

The 'copy from stdin' command should be faster than 'insert', since,
among other things, you avoid looping over the rows within Octave.

As for preparing the command string, I'd guess (not tested) that
sprintf("%i", some_vector) should be faster than repeated calls to
cstrcat.
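
A small sketch of the two ways of building the command string, to show what
is meant. The table name `t' and the data are made up, and this only shows
that both give the same string; I have not timed them.

```octave
v = [3, 14, 15, 92];                 % example data (made up)

% One call: the template "%i," is reused for every element of v.
vals = sprintf ("%i,", v);
vals = vals(1:end-1);                % drop the trailing comma
cmd_a = ["insert into t values (", vals, ");"];

% Same string built piecewise with cstrcat, for comparison.
vals = "";
for k = 1:numel (v)
  vals = cstrcat (vals, sprintf ("%i", v(k)), ",");
endfor
cmd_b = cstrcat ("insert into t values (", vals(1:end-1), ");");
```

To compare speed, wrap each variant in tic/toc with a vector of the size
you expect.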

> I will check, but it 
> would be useful to know where to test?

I don't understand this question.

> Some of the data I look at has ~5000 columns, hence I split this over 
> several postgresql tables. Have you run across problems with getting 
> large tables into octave (I haven't gone there yet)?

Personally I haven't ever used the package with such large amounts of
data. But I've made a test now (Octave 3.6.2 32-bit, Postgresql server
on the same machine as client):

octave:1> a = zeros (1000000000, 3);
error: memory exhausted or requested size too large for range of Octave's index type -- trying to return to prompt
octave:1> clear a
octave:5> a = zeros (10000000, 3);
octave:6> conn = pq_connect (setdbopts ("dbname", "test"));
octave:7> pq_exec_params (conn, "create table octave_array (a float8, b float8, c float8);")
ans = 0
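
For orientation, the sizes involved above can be worked out directly. This
is only arithmetic on the matrix dimensions; it assumes 8 bytes per double
and a 32-bit index type (the usual case for a 32-bit build, but an
assumption here), and says nothing about the extra overhead of cell arrays.

```octave
bytes_per_double = 8;
failed  = 1e9 * 3;                     % elements of zeros (1000000000, 3)
worked  = 1e7 * 3;                     % elements of zeros (10000000, 3)
max_idx = double (intmax ("int32"));   % largest index of a 32-bit index type

printf ("failed request: %g elements, %g GB\n", ...
        failed, failed * bytes_per_double / 1e9);
printf ("working matrix: %g elements, %g MB\n", ...
        worked, worked * bytes_per_double / 1e6);
printf ("failed request exceeds index range: %d\n", failed > max_idx);
```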

Note that the time for the following command includes conversion with
num2cell().

octave:8> tic; pq_exec_params (conn, "copy octave_array from stdin with binary;", setdbopts ("copy_in_from_variable", true, "copy_in_data", num2cell (a))), toc
Elapsed time is 3e+01 seconds.

Getting the data out again with an all-in-one command takes much
longer, and there is heavy disk access, indicating thrashing due to
memory exhaustion (hm ... I would have thought that several GB of RAM
are not limiting in 32-bit Octave ...).

octave:9> clear a
octave:10> tic; a = cell2mat (pq_exec_params (conn, "select * from octave_array;").data); toc
Elapsed time is 2e+02 seconds.

Splitting the command works faster:

octave:12> clear a
octave:13> tic; data = pq_exec_params (conn, "select * from octave_array;"); toc
Elapsed time is 13.983022 seconds.
octave:14> tic; a = data.data; toc
Elapsed time is 6.10351562e-05 seconds.

But the conversion from cell to matrix thrashes again:

octave:15> clear data
octave:17> tic; a = cell2mat (a); toc
Elapsed time is 217.457265 seconds.

Repeating the select is again fast:

octave:18> tic; b = pq_exec_params (conn, "select * from octave_array;").data; toc
Elapsed time is 21.4396012 seconds.

And without cell2mat, in a freshly started Octave, I can repeat the
whole process and do the select several times, always with 9 s
duration. And cell2mat alone, without the database, in a freshly
started Octave, also thrashes with such data. Even with one tenth of
that data it is slow:

octave:1> a = zeros (1000000, 3);
octave:2> tic; a = num2cell (a); toc
Elapsed time is 0.23859 seconds.
octave:3> tic; b = cell2mat (a); toc
Elapsed time is 6.45835 seconds.
octave:4>

So I'd guess it is really cell2mat() which is to blame ...

Analogously, you can easily test whether this problem occurs with the
sizes of data you anticipate (i.e. the number of elements; it would be
tedious to check with a large number of columns). But the problem
seems to be with cell2mat(), not with the database. Nevertheless, I'll
try to find and submit a fix, or at least submit a bug report. Since
the problem is a general one, the chances of getting it fixed should
be good.
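
Such a test could look like the sketch below. The element count is a
placeholder to be replaced by your own. The reshape line is a possible
alternative to cell2mat which I have not tried at these sizes; it assumes
that every cell holds exactly one real scalar, so it is not a general
replacement.

```octave
n_elements = 30000;                  % set to the number of elements you expect
c = num2cell (zeros (n_elements / 3, 3));

tic; m1 = cell2mat (c); t1 = toc;

% Alternative, assuming every cell holds one real scalar:
% c{:} expands in column-major order, reshape restores the shape.
tic; m2 = reshape ([c{:}], size (c)); t2 = toc;

printf ("cell2mat: %g s, reshape: %g s\n", t1, t2);
```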

On Thu, Jun 20, 2013 at 09:08:38PM +0100, richard wrote:
> <snip>
> Also, I'll have to do some checks on how NULL fields are handled. 
> I do not recall seeing these within Octave. 

Quoted from 'help pq_exec_params':

     Octaves `NA' corresponds to a Postgresql NULL value (not `NaN',
     which is interpreted as a value of a float type!).

So for Octave floats, the database will store NA as NULL and NaN as a
float NaN. Both should come out again correctly from Postgresql to
Octave.
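
Within Octave the two can be told apart with isna() and isnan(). Note that
isnan() is also true for NA, so a mask of "real" NaN values needs both. A
short sketch with made-up data:

```octave
x = [1.5, NA, NaN];
isna (x)     % true only for the NA element
isnan (x)    % true for both NA and NaN
% Mask of "real" NaN values, i.e. NaN but not NA:
real_nan = isnan (x) & ! isna (x);
```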

Olaf

-- 
public key id EAFE0591, e.g. on x-hkp://pool.sks-keyservers.net
