Re: [igraph] Translation of R commands into Python
From: Tamas Nepusz
Subject: Re: [igraph] Translation of R commands into Python
Date: Wed, 24 Jun 2009 18:40:07 +0100
Some time ago I posted a small R script with a basic analysis of a
real graph. I am trying to do the same with Python, but I am
experiencing some problems, mainly due to the fact that I am not at
ease with the automatically generated documentation of the Python
module and I have not been able to find much online (have I looked
in the wrong places?).
No, I guess you looked in the right place. The only other
documentation for the Python interface is a sort-of-tutorial that I
started to write some time ago but never had the chance to finish. For
now it lives here:
http://www.cs.rhul.ac.uk/home/tamas/development/igraph/tutorial/
What I would really like to see is how these commands for the igraph
R bindings translate into commands for the Python bindings.
[...]
# Now I have the graph as a two-column matrix
Obviously that's a list of tuples in Python as Python doesn't have a
built-in matrix data type. So I assume that graph_input_unweighted
looks like this:
graph_input_unweighted = [(0,1), (1,2), (1,3), (2,4), (1,4)]
I will also assume that you invoked the following command (which loads
igraph into Python's main namespace):
>>> from igraph import *
g <- graph(t(graph_input_unweighted), directed=FALSE)
>>> g = Graph(graph_input_unweighted, directed=False)
g <- simplify(g)
>>> g.simplify()
Note that most of the functions you used in R translate to methods of
the Graph class in Python. Also note that many of the methods (e.g.,
simplify(), add_vertices(), to_undirected() and such) modify the graph
in-place, unlike R which copies the graph by default.
#degree of each vertex
dd <- degree(g)
>>> dd = g.degree()
pfit <- power.law.fit(dd)
That's not possible directly as there are no built-in statistical
functions in Python. You can try using Scientific Python for that
(import scipy):
http://www.scipy.org/Cookbook/FittingData#head-5eba0779a34c07f5a596bbcf99dbc7886eac18e5
I have to admit that when I had to do this from Python, I simply used
the RPy module which is able to call R functions directly from
Python. :)
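If you would rather stay in pure Python, a rough alternative is the textbook continuous maximum-likelihood estimate of the exponent, alpha = 1 + n / sum(ln(x_i / xmin)). The sketch below is only that formula; the function name fit_power_law_exponent and the choice xmin=1 are my own assumptions, and it is not guaranteed to give the same numbers as power.law.fit in R.

```python
import math

def fit_power_law_exponent(values, xmin=1.0):
    """Continuous ML estimate of alpha for p(x) ~ x^(-alpha), x >= xmin.

    Hypothetical helper for illustration, not part of igraph.
    """
    # Keep only the observations in the tail we are fitting
    tail = [x for x in values if x >= xmin]
    log_sum = sum(math.log(x / float(xmin)) for x in tail)
    if log_sum == 0:
        raise ValueError("need at least one value strictly above xmin")
    return 1.0 + len(tail) / log_sum

# Degree sequence of the five-vertex example graph used earlier
degrees = [1, 4, 2, 1, 2]
alpha = fit_power_law_exponent(degrees, xmin=1)
```

Five data points are of course far too few for a meaningful fit; the list is only there to show the call.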
#Now calculate graph diameter
d <- get.diameter(g)
>>> d = g.get_diameter()  # the vertices along the longest shortest path
#Now calculate the vertex betweenness
#and find the node with the highest betweenness
ver_bet <- betweenness(g)
>>> ver_bet = g.betweenness()
>>> max_index = ver_bet.index(max(ver_bet))
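Note that list.index() only returns the first position of the maximum. If several vertices may tie for the highest betweenness, a small sketch like the following collects all of them; the values in ver_bet here are made up for illustration and do not come from any real graph.

```python
# Hypothetical betweenness values, one per vertex
ver_bet = [0.0, 5.0, 0.5, 0.0, 5.0]

top = max(ver_bet)
# Indices of every vertex whose betweenness equals the maximum
max_indices = [i for i, b in enumerate(ver_bet) if b == top]
```

Comparing floats with == is fine here because the maximum is taken from the same list.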
#Now calculate the clustering coefficients (also called transitivity of a graph)
clust <- transitivity(g, type="local")
>>> clust = g.transitivity_local_undirected()
See also: g.transitivity_undirected(),
g.transitivity_avglocal_undirected()
# calculate average nearest-neighbour degree
knn <- graph.knn(g)
No such function in the Python interface. Hmmm... it eluded my
attention so far for some reason that such a function exists in the C
core. I will add it soon to the Python interface. Anyway, I think it
can be replicated easily in Python:
# Construct a vector of RunningMean objects, one per possible degree
running_means = [RunningMean() for _ in range(g.maxdegree())]
# Get the degrees in a vector
degrees = g.degree()
# Loop from 0 to the number of vertices - 1
for i in range(g.vcount()):
    # Get the degree of the ith vertex
    deg = degrees[i]
    if deg == 0:
        continue
    # Get the degrees of the neighbours
    deg_neighbours = g.vs[g.neighbors(i)].degree()
    # Add the degrees to the running mean calculator for this degree
    running_means[deg-1] << deg_neighbours
# knnk[k-1] is the mean neighbour degree of the vertices with degree k
knnk = [rm.mean for rm in running_means]
I haven't tried it, but something like that should work. The
RunningMean class used above is an auxiliary class in the Python
interface that calculates the mean and variance of whatever numbers
you put into it using the << operator.
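To make the intended computation concrete without relying on igraph at all, here is a plain-Python sketch of the same idea on the small example edge list from above. It pools the degrees of the neighbours by the degree of the central vertex and averages each pool; the variable names are mine, and it returns a dict keyed by degree rather than a list.

```python
from collections import defaultdict

edges = [(0, 1), (1, 2), (1, 3), (2, 4), (1, 4)]

# Build adjacency sets for an undirected simple graph
neighbours = defaultdict(set)
for u, v in edges:
    neighbours[u].add(v)
    neighbours[v].add(u)

degree = dict((v, len(ns)) for v, ns in neighbours.items())

# Pool the neighbour degrees by the degree of the central vertex
pooled = defaultdict(list)
for v, ns in neighbours.items():
    pooled[degree[v]].extend(degree[w] for w in ns)

# Average nearest-neighbour degree for each degree k that occurs
knnk = dict((k, sum(ds) / float(len(ds))) for k, ds in pooled.items())
```

Isolated vertices never show up in the edge list, so they are skipped automatically, which matches the "if deg == 0" check above.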
#Now get the shortest paths in the network
mean_shortest_path <- average.path.length(g)
>>> mean_shortest_path = g.average_path_length()
M_l <- seq(length(d)-1)
for (i in seq(length(d)-1)){
M_l[i] <- mean(neighborhood.size(g, i, nodes=V(g)))
}
There's no direct counterpart for neighborhood.size in the Python
interface yet. If I am not mistaken, neighborhood.size can be
substituted by calculating the shortest path matrix using
g.shortest_paths() and then, for each vertex, counting how many
elements of its row are smaller than or equal to i to get the size of
the i-order neighborhood (the vertex itself is included, at distance
0). So (again, untested), something like this:
distances = g.shortest_paths()
n = g.vcount()
max_order = len(d) - 1  # d is the vertex list returned by get_diameter()
M_l = [0.0] * max_order
for i in range(max_order):
    order = i + 1  # R counts orders from 1, Python indexes from 0
    for v1 in range(n):
        M_l[i] += len([v2 for v2 in range(n) if distances[v1][v2] <= order])
    M_l[i] /= n
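For checking the logic without igraph, the same calculation can be sketched in plain Python with a breadth-first search standing in for g.shortest_paths(). Everything here (bfs_distances, the hard-coded edge list) is my own scaffolding for illustration, not igraph API.

```python
from collections import deque

edges = [(0, 1), (1, 2), (1, 3), (2, 4), (1, 4)]
n = 5

# Adjacency lists for the undirected example graph
adj = [[] for _ in range(n)]
for u, v in edges:
    adj[u].append(v)
    adj[v].append(u)

def bfs_distances(source):
    """Hop distances from source; None marks unreachable vertices."""
    dist = [None] * n
    dist[source] = 0
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if dist[w] is None:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

distances = [bfs_distances(v) for v in range(n)]
diameter = max(x for row in distances for x in row if x is not None)

# Mean size of the order-i neighbourhood for i = 1 .. diameter
M_l = []
for order in range(1, diameter + 1):
    total = sum(1 for row in distances for x in row
                if x is not None and x <= order)
    M_l.append(total / float(n))
```

Each vertex counts itself (distance 0), so the last entry equals the number of vertices whenever the graph is connected.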
--
Tamas