igraph-help
[Top][All Lists]

## Re: [igraph] Working with large networks and how to sample from a graph?

 From: Peter Flom Subject: Re: [igraph] Working with large networks and how to sample from a graph? Date: Wed, 4 Apr 2012 07:37:05 -0400

```Thanks!

Peter

Peter Flom
Peter Flom Consulting
http://www.statisticalanalysisconsulting.com/
http://www.IAmLearningDisabled.com

-----Original Message-----
On Behalf Of Tamás Nepusz
Sent: Wednesday, April 04, 2012 7:27 AM
To: Help for igraph users
Subject: Re: [igraph] Working with large networks and how to sample from a
graph?

> One idea I had was to take a small random sample from the network (say
5,000 nodes) but I am not sure exactly how to do this in igraph.

Well, it depends on how you want to do it. You can try selecting 5000 nodes
randomly from the entire network and then take the subgraph; this is
relatively simple:

library(igraph)
vs <- sample.int(vcount(g), 5000)-1
g2 <- subgraph(g, vs)

However, if your graph is large and sparse enough, there is a chance that
the resulting graph will not be connected at all, and then your estimates
will bear no resemblance at all to the "real" betweenness values.

Another option is to use "snowball sampling", in which you start out from a
selected (and preferably well-connected) node and take the subgraph
consisting of the vertices that are at most k steps away from the seed node.
This can be done with the neighborhood() function, but I think this is
largely equivalent to estimating betweenness by cutting paths after length
k.

Note that there are quite a few papers about estimating betweenness
centrality in very large graphs. I would start reading the following paper
first:

http://www.inf.uni-konstanz.de/algo/publications/bp-celn-06.pdf

Basically, they propose calculating shortest paths only from selected pivot
nodes and then estimate the real betweenness values by numerical
manipulations of the results. igraph implements shortest path calculations
(see get.all.shortest.paths), so in theory it is possible to come up with an
R implementation of their algoritm using igraph. (And if you manage to
implement it, let us know so we can include it in the next version).

Best,
T.

_______________________________________________
igraph-help mailing list