[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [igraph] Problem in leading eigenvector community detection
From: |
Tamas Nepusz |
Subject: |
Re: [igraph] Problem in leading eigenvector community detection |
Date: |
Mon, 19 Jul 2010 15:37:47 +0100 |
Hi,
> In the igraph reference manual - the definition of the 'matrix merges' isn't
> clear to me and therefore I don't know what it is doing.
The concept of the merge matrix is pretty simple. Consider that you have, say,
10 nodes in your graph, and the algorithm generated some dendrogram for you.
The dendrogram is built from the bottom up in the merges matrix. Initially, the
dendrogram contains no branches, only the ten isolated nodes at the bottom,
each belonging to a separate branch. These nodes are denoted by the numbers 0
to 9 (since there are 10 nodes). In each step, two nodes become merged. Let us
assume that the first step merges nodes 4 and 5, and the second step merges
nodes 6 and 9. At this point, the first two rows of the merge matrix will
contain the pair 4-5 (in the first row) and the pair 6-9 (in the second row).
So, each row of the merge matrix contains the number of two nodes that were
merged at a given step. When you merge nodes 4 and 5, a new branching point (a
new "node") is created in the dendrogram that connects nodes 4 and 5, this
becomes node 10 (since 10 is the first number that was not used for a node
index). When you merge nodes 6 and 9, another new branching point is created
which will then be denoted by 11. Now, let us assume that the third step in the
merging process merges node 1 with the pair 4-5 (which has already been merged
in the first step). In this case, the merge matrix will contain 1 and 10 in a
row, since 1 is the index of one of the nodes being merged and node 10 denotes
the branching point of the dendrogram that we created when you merged node 4
and node 5.
Generally, if you have N vertices, indices from 0 to N-1 in the merge matrix
denote the original nodes of the graph. Dendrogram branches are denoted by
index N, N+1, N+2, ..., 2*N-2 (since you will have N-1 branches in a dendrogram
with N leaves if each branch connects two nodes). The dendrogram branch denoted
by index X is created in row X-N of the matrix if we consider the rows to be
indexed from zero as well.
Hope this helps.
> Also when i use the code for a network having nodes = 297 , it gives the
> following warning :
> Warning: Maximum number of ARPACK iterations reached in file community.c,
> line 1176
Try increasing the number of iterations ARPACK is allowed to take (see the
mxiter field of igraph_arpack_options_t:
http://igraph.sourceforge.net/doc/html/igraph_arpack_options_t.html). The
default is 3000, which seems to be too low for your graph at some point.
> I'd also like to know what should be the value of 'steps' for large networks
> . Is 'steps = no. of nodes' the only last best possibility?
Yes, it is, unless you want to stop the algorithm prematurely. Note that
setting steps = no_of_nodes does not mean that it is actually going to produce
that many merges and build a full dendrogram; the leading eigenvector will stop
splitting a community further if the community is deemed indivisible. steps is
only an upper limit; if you set this to a lower value, you might end up with a
community structure where some of the communities could be divided further in a
way that increases modularity.
> Does this method give good results for large networks or should I also try
> the other methods(written in igraph) for community detection.
In general, it should give good results for large networks, but since it
involves a whole lot of eigenvector calculations for large matrices, I'd
suggest you also try some of the other methods in igraph, maybe they do just as
well with less number crunching.
--
Tamas