## [igraph] FW: Calculating betweenness centrality in large graphs

 From: Visser, D.E. Subject: [igraph] FW: Calculating betweenness centrality in large graphs Date: Fri, 18 Nov 2011 17:04:55 +0100

```Hi all!

First of all, let me introduce myself: I am Daniel, a business student from the
Netherlands.

For my thesis I aim to predict "online knowledge sharing" in open source
development communities from users' centrality in a given forum. I plan to ask
stackoverflow.com users to fill in a survey and then calculate their centrality
with igraph.
I'm in possession of a GML graph of the stackoverflow forum that looks like
this:

graph [
directed 1
node [
id 0
label "1"
]
node [
id 1
label "2"
]

Etc.

The GML file is, obviously, quite huge: 650k nodes and millions of edges.
With a cutoff point of 2, calculating betweenness centrality for a single
vertex takes just 15 minutes (I learned that all vertices shouldn't take any
longer), but a cutoff point of 3 already takes an hour to process.

I am not much of a programmer, but after some trial and error I was
able to read the graph file into igraph (in Python) and got to
"filename.betweenness(vertices=1, directed=True, cutoff=2)" which yielded the
centrality index for, I think, the vertex with ID 1 (see above).

So here are my questions:

1) I believe that when I use "vertices=None", centrality is calculated for all
nodes. How can I get igraph to write a CSV file that contains "ID" or "Label"
(or both) in one column, and the corresponding centrality index in the other?

2) Is there any way to determine what a reliable cutoff point would be? I don't
mind letting my computer run for a couple of days, but preferably not more than
a day or three.