## [igraph] FW: Calculating betweenness centrality in large graphs

**From**: Visser, D.E.

**Subject**: [igraph] FW: Calculating betweenness centrality in large graphs

**Date**: Fri, 18 Nov 2011 17:04:55 +0100

Hi all!
First of all, let me introduce myself: I am Daniel, a business student from the
Netherlands.
For my thesis I aim to predict "online knowledge sharing" in open source
development communities from users' centrality in a given forum. I plan to ask
stackoverflow.com users to fill in a survey and then calculate their centrality
with igraph.
I have a GML graph of the stackoverflow forum that looks like
this:
graph [
directed 1
node [
id 0
label "1"
]
node [
id 1
label "2"
]
Etc.
The GML file is, obviously, quite large: 650k nodes and millions of edges.
With a cutoff of 2, calculating betweenness centrality for a single
vertex takes just 15 minutes (I have read that calculating it for all vertices
should not take any longer), but a cutoff of 3 already takes an hour to process.
Secondly, I am not much of a programmer, but after some trial and error I was
able to read the graph file into igraph (in Python) and arrived at
"filename.betweenness(vertices=1, directed=True, cutoff=2)", which yielded the
centrality index for, I think, the vertex with ID 1 (see above).
So here are my questions:
1) I believe that with "vertices=None" centrality is calculated for all nodes.
How can I get igraph to write a CSV file that contains "ID" or "Label" (or
both) in one column and the corresponding centrality index in the other?
2) Is there any way to determine what a reliable cutoff point would be? I don't
mind letting my computer run for a couple of days, but preferably not more than
a day or three.
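
For question 1, one possible approach is sketched below in Python. It is only a sketch under assumptions: it assumes the python-igraph package, that the GML reader exposes each node's "id" and "label" as vertex attributes, and a hypothetical input file name "stackoverflow.gml". The helper names write_centrality_csv and export_betweenness are illustrative and not part of igraph.

```python
import csv


def write_centrality_csv(path, ids, labels, scores):
    """Write one row per vertex: id, label, betweenness."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["id", "label", "betweenness"])
        writer.writerows(zip(ids, labels, scores))


def export_betweenness(graph, path, cutoff=2):
    """Compute betweenness for all vertices and save it as CSV.

    `graph` is assumed to be an igraph.Graph whose vertices carry
    "id" and "label" attributes (as expected after reading GML).
    """
    # vertices=None asks for the scores of every vertex, in vertex order.
    scores = graph.betweenness(vertices=None, directed=True, cutoff=cutoff)
    write_centrality_csv(path, graph.vs["id"], graph.vs["label"], scores)


# Usage (requires python-igraph; both file names are hypothetical):
#   import igraph
#   g = igraph.Graph.Read_GML("stackoverflow.gml")
#   export_betweenness(g, "betweenness.csv", cutoff=2)
```

The scores come back in the same order as the vertices, so zipping them with the attribute lists should keep each ID and label next to its own centrality value.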
Thanks in advance!
Daniel
