[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [igraph] Efficiently selecting a node in python igraph

From: Tim Althoff
Subject: Re: [igraph] Efficiently selecting a node in python igraph
Date: Sat, 10 May 2014 16:25:33 -0700

Thanks for the comprehensive answer, helped a lot!

On Sat, May 10, 2014 at 12:57 PM, Tamás Nepusz <address@hidden> wrote:

> - find node object based on keyword attribute
Okay, this is a bit undocumented and must be improved for the next version, but here’s the trick: the “name” vertex attribute is indexed g.vs.find() is the right way to go but keep in mind that it has O(n) complexity since node attributes are not indexed -- except the “name” attribute, which is indexed. igraph will take advantage of this if you invoke g.vs.find() with a positional argument only *and* the type of this positional argument is string. (Integers won’t work because g.vs.find(X) will return the node with internal ID equal to X). So, this is what I have on my machine:

In [1]: g=Graph.Barabasi(100000)
In [2]: g.vs["name"] = map(str, range(100000))
In [3]: g.vs["another_attribute"] = map(str, range(100000))
In [4]: %timeit g.vs.find(another_attribute="99999")
100 loops, best of 3: 17.8 ms per loop
In [5]: %timeit g.vs.find(name="99999")
100 loops, best of 3: 18.6 ms per loop


In [6]: %timeit g.vs.find("99999")
100000 loops, best of 3: 2.13 us per loop

The first issue here is that this behaviour is undocumented; the second is that it should work even if you specify g.vs.find(name=“99999”) since the two operations are essentially equivalent. An additional advantage of using a “name” attribute with type string is that you can use these names wherever igraph expects a node or node ID, _without_ having to look up the node explicitly using find(). However, using the “name” attribute comes at a cost since its internal index has to be invalidated every time you modify the graph and then rebuilt the next time you try to look up a node by name.

> - add new node to graph (add_vertex) [since this returns none I need to
> "find" the new node obj]
Here you can take advantage of the fact that the newly added node always receives the first available integer node ID, so its ID will be g.vcount()-1 after the addition. This eliminates the need for .find() since you can simply use g.add_edge(paper_id_node, g.vcount()-1).

> Currently this is very slow. Any tips?
As Gabor said: igraph’s data structure is optimized for static graphs, so adding edges one by one is slower than adding them in batches. Maybe it is faster to do something like this:

id_gen = UniqueIdGenerator()
edgelist = []
for paper_id_1, paper_id_2 in whatever_generator_generates_your_edges:
    edgelist.append((id_gen[paper_id_1], id_gen[paper_id_2]))
g = Graph(edgelist)
g.vs[“name”] = id_gen.values()


reply via email to

[Prev in Thread] Current Thread [Next in Thread]