[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [igraph] weak ties / structural holes in Igraph

From: Magnus Thor Torfason
Subject: Re: [igraph] weak ties / structural holes in Igraph
Date: Mon, 26 Sep 2011 11:33:11 -0400
User-agent: Mozilla/5.0 (Windows NT 6.1; rv:6.0.2) Gecko/20110902 Thunderbird/6.0.2

Hi Tamas,

Thanks for the advice. Seeing this code reminds me of a message I sent the list last year about the efficiency of edge subsetting, and Gabor's reply:


I never did manage to get the development version up and running, but it seems that the edge subsetting issue should be resolved in the current version. So I think I'll try it out with the neighborhood.ecount() function.

However, I am very enthusiastic about your comment that the efficiency of graph.neighborhood() has been increased in the development version, since social scientists tend to be interested in a variety of ego network measures (not just the ego network density), and an efficient graph.neighborhood() function would make that much more feasible. And this perhaps applies to subgraph() in general? Going from a copy-then-prune to construct-from-scratch model would make all the difference in efficiency for me. As it is now, graph.neighborhood() would create 100K copies of a 100K node network. Clearly infeasible.

So, the follow up question would be: Any idea when we can expect a release that incorporates the new implementation of graph.neighborhood()?

Thanks for your ongoing support and attention!


ps. I apologize for my R-speak. That's really the only igraph implementation that I know and use, so I'll have to leave the translation into C to others.

pps. Without knowing for sure, I suspect that the reason you went for copy-then-prune, in graph.neighborhood() has to do with preserving the attributes in a simple way. With construct-from-scratch, all the attributes have to be copied individually as well I presume. Which makes me think that perhaps it would not be that much more complicated to allow the user to specify which attributes should be included in the subgraphs. So if a guy like me, with a large network with a bunch of attributes (both edge and vertex) needs to generate a list of subgraphs, to calculate some measures, it would probably improve efficiency in terms of speed and memory if it were possible to pass a list of which attributes to copy to the new graphs - in some cases, no attributes would need to be copied.

On 9/26/2011 4:42, Tamas Nepusz wrote:
Hi Magnus,

The bottleneck in your code is most likely graph.neighborhood, which calls
subgraph() (well, igraph_subgraph in C) in the background. In igraph 0.5.x,
igraph_subgraph() copies the entire graph and deletes the vertices *not* in
the subgraph, which is pretty inefficient for large graphs. This has been
rectified in the development version already, but until then, you should
find an alternative solution that does not involve graph.neighborhood.

One thing that you could try is to derive ego.vcount from degree(g) since
the size of the first-order ego network is always one larger than the degree
of the central vertex. (Alternatively, you could use neighborhood.size).
ego.ecount is trickier, but you can try something like this (untested and
I'm not an expert in R):

neighborhood.ecount<- function(g, vertex) {
   neis<- neighbors(g, vertex, mode="all")
   neis<- c(neis, vertex)
   length(E(g) [ neis %--% neis ])

Applying this function to every element in the range 0:vcount(g)-1 should
then give you the number of edges in the ego network of each vertex.


On 09/23/2011 08:13 PM, Magnus Thor Torfason wrote:
Hi Gabor and others,

It's an interesting thread, I've been doing some work using
ego-network-type measures, including the constraint() measure.

An alternative to get at the aspect of the ego-network structure that
constraint() measures is ego-network-density. It is in some ways more
intuitive than the constraint people and some people prefer it. The
other day I threw together a quick and dirty implementation of this
function (see at the bottom of the mail). However, my implementation
depends crucially on graph.neighborhood(), which did not work very well
for my graph of several hundred thousand vertices.

So I wonder if there is an implementation available for this that allows
me to calculate ego network density for the whole graph without
requiring the generation of vcount(g) sub-graphs. I also wonder if I'm
still correct in assuming that graph.neighborhood() is infeasible to use
for me on a 100K node network?


ps. Thanks again Gabor and Tamas, for igraph and your continued support
of the program and this list.

# Function to calculate ego network density
# Probably not the most efficient, and may not even
# be 100% correct. Caveat emptor.
ego.density = function(g)
      l.ego.graphs              = graph.neighborhood(g,1)
      ego.ecount                = sapply(l.ego.graphs, ecount)
      ego.vcount                = sapply(l.ego.graphs, vcount)
      ego.friend.count          = ego.vcount - 1
      ego.friend.tie.count.max  = ego.friend.count*(ego.friend.count-1)/2
      ego.friend.tie.count.real = ego.ecount - ego.friend.count
      ego.density.result        =

igraph-help mailing list

igraph-help mailing list

reply via email to

[Prev in Thread] Current Thread [Next in Thread]