igraph-help
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [igraph] community detection algorithms


From: Larson, TR
Subject: Re: [igraph] community detection algorithms
Date: Thu, 29 Oct 2009 15:46:20 +0000
User-agent: Thunderbird 2.0.0.23 (Windows/20090812)

OK, I've now done a few more tests on label.propagation.community with my data - just thought you'd be interested to know the results:

I've tried 3 methods to get a consistent membership result, with 100 runs of each method to evaluate consistency:

1) Using membership from the most frequent modularity score result (10000 iterations). This gave the same membership in each of 100 runs, so the most successful. However, the modularity score results over 10000 iterations were not normally distributed - there were 4 distinct frequency maxima, with the most frequent value (i.e. the one used to calculate membership) only having a frequency of about 0.6.

2) Using membership from the highest modularity score result (10000 iterations). Very inconsistent - 3 or 4 different results through the 100 runs.

3) Aggregation of all memberships (no modularity taken into account). The highest number of splits(as expected), but again inconsistent as in method (2).

So I'm sticking with method (1) for now.

thanks
Tony



Tamas Nepusz wrote:
Hi Tony,

Walktrap and fastgreedy give the same membership results every time I run the algorithms. However, label.propagation.community it gives slightly different groupings each time. I have tried setting initial to a set vector (either all 0 or 0:(length(V(g)$name)-1)) and fixed to a vector of all FALSE values in case the algorithm results vary because of random starting values. However, this has no effect.
Unfortunately the label propagation algorithm is inherently unstable;
even when the starting position is completely specified, there are
internal random decisions within the algorithm (e.g., the order in which
the nodes are visited during a single pass of the algorithm, or the
decision a node takes when two different labels appear in its
neighborhood with equal frequency). These random decisions are necessary
to avoid oscillations in the algorithm when it encounters bipartite or
near-bipartite structures within a graph. Section III B in the original
Phys Rev E paper of Raghavan et al (http://arxiv.org/abs/0709.2938)
deals with the problem of aggregating different results into one
consistent solution, but this is just one possible way of dealing with
the problem and it is not implemented in igraph. There are also some
modifications to the original algorithm that try to reduce the
instability of the algorithm, but these are not (yet) implemented in
igraph:

http://arxiv.org/pdf/0910.1154

If you are looking for one single consistent solution with the existing
tools provided by igraph, try running the label propagation algorithm
many times (say, 1000 times) and select a solution based on some
internal quality measure; for instance, the modularity metric (this can
readily be calculated by igraph).

Any comments would be most welcome; also any other suggestions on how to extract related clusters from my graph structure.
Tom Gregorovic has sent a patch earlier to this mailing list which
implements a fast hierarchical community detection algorithm by Blondel
et al [1]. I'm in the process of integrating this algorithm to igraph
and it will be available soon (maybe next week) in one of the igraph
nightly builds. Also, a recent review paper of Fortunato et al gives a
fairly detailed overview of different community detection approaches on
graphs.

[1] http://www.iop.org/EJ/article/1742-5468/2008/10/P10008/jstat8_10_p10008.html
[2] http://arxiv.org/pdf/0906.0612


--




reply via email to

[Prev in Thread] Current Thread [Next in Thread]