igraph-help
[Top][All Lists]

## [igraph] On walktrap clustering and isolated vertices

 From: Jeremy Kun Subject: [igraph] On walktrap clustering and isolated vertices Date: Thu, 22 May 2014 13:14:01 -0500

Hi there,

I've been working with igraph's walktrap clustering function for a while now, but I've noticed some behavior that strikes me as strange. In particular, if I run walktrap on a graph with an isolated vertex, the resulting dendrogram is either empty, raises an exception when I try to print it, or raises an exception when I try to convert it to a clustering via as_clustering(). This use case shows up in my work in a way that's hard to avoid, because I'm essentially sampling graphs from a distribution that gives a modest nonzero probability for a vertex to be isolated.

I've pasted an example use case below that tries to run walktrap on an empty graph on ten vertices, then adds an edge and tries again, then forms the complete graph K_9 plus a single isolated vertex and tries again.

What I don't understand is why these errors are occurring. Do the igraph devs (who are hopefully reading this :)) want to unilaterally ban anyone from doing walktrap on a graph with multiple components? I don't see any technical reasons to do this: even if there's an isolated vertex the walktrap function could add a self-loop, as Pons/Latapy do IIRC, and then the distances between vertices in different components can be defined to be 1+max so that they are merged last in the dendrogram, and in an arbitrary fashion.

Or, does this appear to be a true bug?

Note that when I try to do walktrap with, say, a disjoint union of two small cliques, the as_clustering() produces a good clustering, but printing the dendrogram sill raises an exception. So the issue does appear to be isolated to isolated vertices.

Python 3.3.3 (default, Dec 30 2013, 23:51:18)
[GCC 4.2.1 Compatible Apple LLVM 5.0 (clang-500.2.79)] on darwin
>>> import igraph
>>> G = igraph.Graph(10)
>>> cl1 = G.community_walktrap()
>>> print(cl1)
Dendrogram, 0 elements, 0 merges
>>> cl1.as_clustering()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File ".../igraph/clustering.py", line 959, in as_clustering
num_elts - n)
igraph._igraph.InternalError: Error at ../../../src/community.c:767: `steps' to big or `merges' matrix too short, Invalid value
>>> G[1,2] = 1
>>> cl2 = G.community_walktrap()
>>> print(cl2)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File ".../igraph/clustering.py", line 607, in __str__
return self.summary(verbosity=1)
File ".../igraph/clustering.py", line 673, in summary
width = max(positions)+1
TypeError: unorderable types: int() > NoneType()
>>> cl2.as_clustering()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File ".../igraph/clustering.py", line 959, in as_clustering
num_elts - n)
igraph._igraph.InternalError: Error at ../../../src/community.c:767: `steps' to big or `merges' matrix too short, Invalid value
>>> G.add_edges([(i,j) for i in range(1,10) for j in range(1,10) if i != j])
>>> len(G.es)
73
>>> cl3 = G.community_walktrap()
>>> print(cl3)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File ".../igraph/clustering.py", line 607, in __str__
return self.summary(verbosity=1)
File ".../igraph/clustering.py", line 673, in summary
width = max(positions)+1
TypeError: unorderable types: int() > NoneType()
>>> cl3.as_clustering()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File ".../igraph/clustering.py", line 959, in as_clustering
num_elts - n)
igraph._igraph.InternalError: Error at ../../../src/community.c:767: `steps' to big or `merges' matrix too short, Invalid value

Jeremy Kun
Mathematics PhD Candidate
University of Illinois at Chicago