Re: Rethinking octave_idx_type
From: Daniel J Sebald
Subject: Re: Rethinking octave_idx_type
Date: Sat, 26 Nov 2016 14:25:56 -0600
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.7.2
On 11/26/2016 01:07 PM, Bernardo Sulzbach wrote:
On 11/25/2016 03:27 PM, Michael D Godfrey wrote:
Instead, it seems that we could define octave_idx_type to be ssize_t
(or ptrdiff_t, I think they are equivalent in practice). Then things
like fread, fwrite, or simple element-by-element array operations that
don't require BLAS or LAPACK functions could work on larger arrays.
This appears to be a significant improvement.
This would also fix numel() calls returning overflown values on big
matrices under 64-bit systems too.
... something like 8-bit image data would fit in 8-32 G and overflow the
index. But if the programmer uses y = f(x) there needs to be enough
memory for two such large matrices. Otherwise one has to use x = f(x),
because certainly operating on individual elements using indexing is slow.
Rik's concern about the speed of 32-bit vs. 64-bit indexing is a
good one. The answer depends on the CPU and the bus, but it
certainly seems to me that speed takes precedence over the rarer case of
8+ G matrix/vector sizes. There are certainly quite common cases where
the user has > 8G of data to process, but typically that is done not by
bringing the whole data record in at once, but by processing it in
blocks. E.g., the filter routines return a state vector that
can be passed back in for the next block of data, etc. Being crafty
about processing data efficiently, in both CPU and memory, is what it's about.
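
To make the block-processing idea concrete, here is a minimal C++ sketch. It is not Octave's filter code: `filter_block` is a hypothetical name and the one-pole low-pass is chosen only for brevity. The final state of one block is carried into the next, so the result matches filtering the whole record at once while only one block has to be in memory.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper, not an Octave API.
// One-pole low-pass: y[n] = a * y[n-1] + (1 - a) * x[n].
// `state` holds y[n-1] on entry and the last output on return, so it can be
// handed to the next call. The block is filtered in place (x = f(x)), which
// avoids needing memory for a second array.
void filter_block(std::vector<double>& block, double a, double& state)
{
  assert(a >= 0.0 && a < 1.0);
  for (double& sample : block)
    {
      state = a * state + (1.0 - a) * sample;
      sample = state;
    }
}
```

A caller would read one block at a time, call `filter_block` on it with the same `state` variable, write the block out, and repeat. In Octave itself the same pattern is available through the initial and final condition arguments of `filter`.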
I wonder if it makes sense to have 64-bit indexing be a different Octave
type, because internally Octave would know when > 32-bit indexing is
needed, e.g., ones(2^17). But that seems unnecessary too.
In fact, let's see what this does:
octave:9> x = ones(bitshift(int64(1), 17));
error: out of memory or dimension too large for Octave's index type
octave:9> x = ones(bitshift(int64(1), 16));
error: out of memory or dimension too large for Octave's index type
octave:9> x = ones(bitshift(int64(1), 15));
OK, after the last command my system essentially attempted to swap out
all memory and is now gradually coming back to life.
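
The arithmetic behind those three results can be sketched in C++. This assumes a signed 32-bit index type and 8-byte doubles (the default class returned by ones); `OnesCost` and `ones_cost` are illustrative names, not anything in Octave. It computes the element count and byte size of an n-by-n matrix and checks whether the count fits the index.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Illustrative only: the cost of ones(n), an n-by-n matrix of 8-byte doubles.
struct OnesCost
{
  std::uint64_t elements;  // n * n
  std::uint64_t bytes;     // elements * 8
  bool fits_int32;         // can a signed 32-bit index count the elements?
};

OnesCost ones_cost(std::uint64_t n)
{
  // n <= 2^30 keeps n * n * 8 within 64 bits.
  assert(n <= (std::uint64_t{1} << 30));
  OnesCost cost;
  cost.elements = n * n;
  cost.bytes = cost.elements * 8;
  cost.fits_int32 = cost.elements
    <= static_cast<std::uint64_t>(std::numeric_limits<std::int32_t>::max());
  return cost;
}
```

Under those assumptions, 2^17 and 2^16 give 2^34 and 2^32 elements, both beyond a 32-bit signed index, while 2^15 gives 2^30 elements, which fits the index but needs 8 GiB of memory. That is consistent with two errors followed by heavy swapping.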
I like the error message. Perhaps it would make sense to differentiate
those two conditions, i.e., rather than say it is this or that,
something like
error: out of memory
vs.
error: dimension too large for Octave's index type (rebuild using option --index64)
Of course, one of these has to take precedence under the condition they
are both true. I think the only way to test how much memory is available
in a system is to actually try the malloc, right? So maybe the latter
should take precedence.
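
A rough C++ sketch of that ordering, under stated assumptions: `try_alloc_doubles` is a hypothetical function, not how Octave actually reports the error, and the index type is a template parameter only so both cases can be tried. The index-range check is pure arithmetic, so it runs first, and the allocation is attempted only when the dimensions are representable.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <new>
#include <string>

// Hypothetical sketch: report the two failure conditions separately.
// Idx stands in for the index type (e.g. std::int32_t or std::int64_t).
template <typename Idx>
std::string try_alloc_doubles(std::uint64_t rows, std::uint64_t cols)
{
  const std::uint64_t max_idx
    = static_cast<std::uint64_t>(std::numeric_limits<Idx>::max());

  // 1. Index check: no memory is touched, so it can always be answered.
  if (rows > max_idx || cols > max_idx
      || (cols != 0 && rows > max_idx / cols))
    return "error: dimension too large for Octave's index type";

  // 2. Memory check: the only portable test is to try the allocation.
  const std::uint64_t count = rows * cols;  // cannot overflow after step 1
  if (count > std::numeric_limits<std::size_t>::max() / sizeof (double))
    return "error: out of memory";
  double* data = new (std::nothrow) double[static_cast<std::size_t>(count)];
  if (data == nullptr)
    return "error: out of memory";

  delete[] data;
  return "ok";
}
```

One caveat: on systems that overcommit memory, the allocation can succeed and the trouble only shows up once the pages are written, which looks like the swapping in the 2^15 case above. A successful allocation is therefore not proof that enough memory is available.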
My point is that with the above, only the users who know they need
64-bit indexing would build it as such, assuming that 64-bit indexing
takes a CPU hit compared to 32-bit indexing.
Dan
PS: This struck me as odd:
octave:1> x = ones(2^33);
octave:2>
octave:2> size(x)
ans =
0 0
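
One possible reading of the 0 0 result, offered as an assumption rather than something the output confirms: if the requested dimension is narrowed to 32 bits somewhere along the way, the low 32 bits of 2^33 are all zero. The C++ sketch below shows only that arithmetic; `low32_of` is a hypothetical helper and says nothing about where or how Octave performs the conversion.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper: keep only the low 32 bits of a non-negative,
// integer-valued double. Going through unsigned types keeps the narrowing
// well defined (reduction modulo 2^32).
std::uint32_t low32_of(double dim)
{
  assert(dim >= 0.0 && dim < 18446744073709551616.0  // below 2^64
         && dim == std::floor(dim));
  return static_cast<std::uint32_t>(static_cast<std::uint64_t>(dim));
}
```

With this helper, 2^33 (8589934592) maps to 0, so a matrix built from such a value would be 0-by-0 with no error raised.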