igraph-help
[Top][All Lists]

## Re: [igraph] (no subject)

 From: Tamás Nepusz Subject: Re: [igraph] (no subject) Date: Thu, 1 Mar 2012 20:35:29 +0100

How is a Graph internally represented?
We currently use an indexed edge list data structure. Basically, we have a list in which the edges are stored in the order of addition. We then build two indexes on top of the edge list; one of the indexes can be used to traverse the edge list in increasing order of the source vertices, the other one can be used to traverse the edge list in increasing order of the target vertices. We also have a second-level index that allows one to jump quickly to the point in the first-level index where the out- or in-neighbors of a particular vertex start. The advantage of this data structure is that most of the query operations are fast (see http://igraph.sourceforge.net/doc/html/ch04s02s02.html for more details), but this comes at a price, since we have to re-build the indices when the graph is mutated (e.g., nodes or edges are added or removed). Things are not as bad as they seem, though, but it is in general advantageous to add/delete nodes or edges in large batches because we can then re-build the index only after all the additions or removals have been performed.

Let's say I create a graph with
`int igraph_create(igraph_t *graph, const igraph_vector_t *edges, igraph_integer_t n, 		  igraph_bool_t directed);`
How can I know which data structures are used to represent the graph internally?
Ideally, you shouldn't worry about that, and you shouldn't rely on that either ;) We may replace the underlying data structure at any time.

Is it then really representad as a Vector of edges?
Yes, but note that the vector is indexed. (If you are interested in the gory details, check out src/type_indexededgelist.c -- you can see that we maintain six vectors in igraph_t; two for the edge list itself, two for the first-level indices and two for the second-level indices). So, for instance, getting the degree of a vertex does not require us to scan the entire vector: we can simply jump to the point in the first-level index where the neighbors of the vertex start, then jump to the point where the neighbors of the _next_ vertex start (since the first-level index is sorted by the source vertices), subtract the two pointers, and then we have the degree. The page I linked above lists the time complexity of the query operations and none of them takes more time than O(d), where d is the average degree of a vertex.