help-octave
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Optimising first bwlabeln implementation


From: Jordi Gutiérrez Hermoso
Subject: Optimising first bwlabeln implementation
Date: Sun, 8 May 2011 19:59:32 -0500

I'm not sure if I should be discussing this in the Octave-Forge
mailing list instead, but I figure most people who read the
Octave-Forge list read these too.

Attached is my first implementation of bwlabeln for the image package.
I am freely using C++0x, so if you're using gcc you will need a
somewhat recent version (4.4 works) and pass the -std=c++0x
compilation CXXFLAG to mkoctfile. C++0x could be avoided, but I don't
see a strong reason to do so anymore.

I know this initial implementation is very dirty, but it appears to
work. It's implemented with a union-find structure per the description
of Matlab's own bwlabeln. It uses too much memory and it's slow. After
profiling with oprof, it seems evident that I am spending all the CPU
time on hashing coordinates.

I am asking for advice on how to optimise it.

A quick high-level run-down of what it currently does:

     1) Check arguments (this is very poorly done right now)

     2) Creates an std::set of neighbours to check based on the
        connectivity mask that the user passes, making sure to not
        include redundant neighbours. The connectivity mask is assumed
        to be symmetric for this purpose, although this is not
        currently checked.

     3) Iterates with a linear index through all of the voxels in
        boolNDArray that represents the image

     4) For each voxel, first find it with union-find so that it gets
        assigned to a set of the union-find partition

     5) Check each neighbour in the std::set obtained from the
        connectivity mask, performing a union with the current voxel
        if necessary.

     6) Run again through the union-find structure again resetting
        labels to be sequential and start from 1.

     7) Output the labelled NDArray and the number of labels.

The union-find structure is implemented, perhaps in much greater
generality than necessary, in the union-find.h++ template header.
Notice I'm using a C++0x unordered_map which is doing lots and lots of
hashing. The first obvious optimisation is to not use coordinates but
stick to linear indices, but then this has the problem that figuring
out when a voxel is at a boundary when checking its neighbours becomes
more complicated.

I have another general question, about how to properly use Octave
classes. You'll notice I wrote a bunch of silly operators for
Array<octave_idx_type> because I couldn't find a way to manipulate
coordinates otherwise. This point will become moot once I modify the
code to not use coordinates anymore or at least not as directly, but
just out of curiousity, if I did want to do computations with the
coordinates, how should I have done it?

Thanks in advance, and I hope this is interesting.

- Jordi G. H.

Attachment: bwlabeln.cc
Description: Text Data

Attachment: union-find.h++
Description: Text Data


reply via email to

[Prev in Thread] Current Thread [Next in Thread]