[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [changeset] histc
From: |
Jaroslav Hajek |
Subject: |
Re: [changeset] histc |
Date: |
Mon, 9 Mar 2009 13:04:13 +0100 |
On Sun, Mar 8, 2009 at 7:49 PM, John W. Eaton <address@hidden> wrote:
> On 8-Mar-2009, Jaroslav Hajek wrote:
>
> | I don't think so. After all, 3.2 is mainly in John's hands, so if he's
> | OK with it, it's fine. I wonder, John, would you be fine with me
> | optimizing accumarray and improving histc as well.
>
> It's not a problem to add an isolated function like this now. No
> other parts of Octave depend on it, so I don't see how adding it can
> cause a regression. Also, it is a .m file, so if there is a problem
> with it that is not discovered until after the release, it can be
> fixed even by those people who can't easily rebuild Octave.
>
> So do whatever you like about it. But for those people who need the
> function, it is better to have a slow version than not have it at all.
>
> jwe
>
OK, so I optimized accumarray for the default summation case, and
optimized histc using lookup & accumarray.
Short benchmark for accumarray:
m = 1e2;
ns = [1e5 1e6 1e7];
for n = ns
idx = ceil (m*rand (n, 1));
vals = rand (n, 1);
disp (n);
tic; A = accumarray (idx, vals); toc
endfor
with recent tip, I get:
100000
Elapsed time is 0.03557897 seconds.
1000000
Elapsed time is 0.28006887 seconds.
10000000
Elapsed time is 3.13704205 seconds.
with the optimized accumarray, I get:
100000
Elapsed time is 0.004490137 seconds.
1000000
Elapsed time is 0.021013975 seconds.
10000000
Elapsed time is 0.199796915 seconds.
as expected, the speed-up factor increases with problem size: 7.9, 13.3, 15.7.
(it's O(N*log(N)) vs. O(N)).
a short benchmark for histc (classifying 4 millions normal random
numbers into equidistant bins):
n = 4e6;
randn ("state", 1)
data = randn (n, 1);
for ns = [1 2 3 4 5 20 100 500]
bins = linspace (-3, 3, ns);
disp (ns);
tic; cnt = histc (data, bins); toc
endfor
with the previous implementation:
address@hidden:~/devel/octave/patches> ./run-octave -q tt3.m
1
Elapsed time is 0.036 seconds.
2
Elapsed time is 0.12 seconds.
3
Elapsed time is 0.22 seconds.
4
Elapsed time is 0.29 seconds.
5
Elapsed time is 0.38 seconds.
20
Elapsed time is 1.67 seconds.
100
Elapsed time is 8.505 seconds.
500
Elapsed time is 42.66 seconds.
with the new implementation:
address@hidden:~/devel/octave/patches> ./run-octave -q tt3.m
1
Elapsed time is 0.037 seconds.
2
Elapsed time is 0.12 seconds.
3
Elapsed time is 0.22 seconds.
4
Elapsed time is 0.3 seconds.
5
Elapsed time is 0.29 seconds.
20
Elapsed time is 0.355 seconds.
100
Elapsed time is 0.4199 seconds.
500
Elapsed time is 0.4822 seconds.
Again you see that the speed-up increases dramatically with number of
bins. This is O(M*N) vs. O(M*log(N) + N). (M data size, N number of
bins). Actually for very small bin counts (< 4) the lookup approach is
somewhat slower, so I used a split-up strategy that reverts to the
Soren's implementation if bins <= 3.
regards
--
RNDr. Jaroslav Hajek
computing expert & GNU Octave developer
Aeronautical Research and Test Institute (VZLU)
Prague, Czech Republic
url: www.highegg.matfyz.cz