Re: [igraph] parallel processing

Tamás Nepusz

Tue, 9 Oct 2012 20:53:18 +0200

> It was a simplification of my program. I have a data set of network with
> quite 43,000,000 edges stored in a special unusual format. I myself should
> make an empty graph and append edges. the process is very high and It takes
> too long if I want to do sequentially.

The problem is not that you are doing it sequentially. The problem is that the
R interface of igraph does not _modify_ the graph in-place when you add edges;
it creates a _copy_ of the graph and adds the edges to the copy instead. (That's
why you have to write g <- g + edges(whatever).) Copying the graph is expensive,
and if you add the edges one by one, you pay for a full copy on every single
edge. The solution is simple: add your edges in batches. For instance, you can
start reading your file and collect the edge list in a plain R vector. When you
reach 1 million edges, add all of them to g at once, clear your vector and
continue reading the file. I don't know whether R has an internal limit on the
length of vectors; if it doesn't, you can simply read all your 43,000,000 edges
into one long vector of 86,000,000 numbers (two for each edge) and then
construct the graph in a single call to the graph() constructor.
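
To make the batching idea concrete, here is a minimal sketch and not a
definitive implementation. It assumes a recent igraph for R, where the
functions are called make_empty_graph(), add_edges() and make_graph(); older
releases spell them graph.empty(), add.edges() and graph(). read_edge() is a
hypothetical stand-in for whatever parses your file format, and the sizes are
toy values so that the example runs quickly.

```r
library(igraph)

# Toy sizes; the real data set has about 43,000,000 edges.
n_vertices <- 1000
n_edges    <- 5500
batch_size <- 1000   # edges per batch; around 1 million is suggested for real data

# Hypothetical stand-in for the custom file parser: returns the i-th edge
# as c(from, to) with 1-based vertex ids.
read_edge <- function(i) {
  c((i - 1) %% n_vertices + 1, (i * 7) %% n_vertices + 1)
}

# Variant 1: add the edges in batches, so the graph is copied once per batch
# instead of once per edge.
g <- make_empty_graph(n = n_vertices, directed = FALSE)
buffer <- numeric(2 * batch_size)   # preallocated, two numbers per edge
filled <- 0

for (i in seq_len(n_edges)) {
  buffer[filled + 1:2] <- read_edge(i)
  filled <- filled + 2
  if (filled == length(buffer)) {
    g <- add_edges(g, buffer)       # flush a full batch
    filled <- 0
  }
}
if (filled > 0) {
  g <- add_edges(g, buffer[seq_len(filled)])   # flush the last partial batch
}

# Variant 2: collect every edge in one long vector and build the graph once.
all_edges <- numeric(2 * n_edges)
for (i in seq_len(n_edges)) {
  all_edges[c(2 * i - 1, 2 * i)] <- read_edge(i)
}
g2 <- make_graph(all_edges, n = n_vertices, directed = FALSE)
```

Both variants should end up with the same number of edges; the second one
needs enough memory to hold the whole edge vector at once.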
Cheers,
Tamas