Re: [igraph] parallel processing

From: Tamás Nepusz
Subject: Re: [igraph] parallel processing
Date: Tue, 9 Oct 2012 20:53:18 +0200

> It was a simplification of my program. I have a network data set with 
> about 43,000,000 edges stored in an unusual format. I have to create an 
> empty graph myself and append the edges. The process is very slow and 
> takes too long if I do it sequentially.

The problem is not that you are doing it sequentially. The problem is that the 
R interface of igraph does not _modify_ the graph in-place when you add edges; 
it creates a _copy_ of the graph and adds the edges to the copy instead. 
(That's why you have to write g <- g + edges(whatever).) Copying the graph is 
expensive, especially if you add the edges one by one, because then the whole 
graph is copied once for every single edge. The solution is simple: add your 
edges in batches. For instance, you can start reading your file and collect 
the edge list in a simple R vector. When you reach 1 million edges, you add 
all of them to g at once, clear your vector and continue reading the file. I 
don't know whether R has an internal limit on the length of vectors; if it 
doesn't, you can simply read all your 43,000,000 edges into one long vector of 
86,000,000 numbers (two for each edge) and then construct the graph in a 
single call to the graph() constructor.
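
To make the two options concrete, here is a minimal sketch rather than a 
definitive implementation. It assumes a recent version of the R package, where 
the functions are called make_empty_graph(), add_edges() and make_graph() (in 
the 0.6 series the first two were graph.empty() and add.edges(), and the 
constructor was graph()). Since the file format is not described in this 
thread, the sketch does not parse anything: a small random edge vector stands 
in for the real data, and the names n_vertices, n_edges, all_edges and 
batch_size are made up for illustration.

```r
library(igraph)

# Hypothetical stand-in for the real data: a flat vector of vertex IDs in the
# form (from1, to1, from2, to2, ...). Parsing the actual file is left out.
set.seed(42)
n_vertices <- 1000
n_edges    <- 5000
all_edges  <- sample(n_vertices, 2 * n_edges, replace = TRUE)

# Option 1: add the edges in batches, so the graph is copied once per batch
# instead of once per edge.
batch_size <- 1000   # edges per batch (illustrative; the email suggests ~1 million)
g_batched  <- make_empty_graph(n = n_vertices, directed = FALSE)
starts     <- seq(1, length(all_edges), by = 2 * batch_size)
for (s in starts) {
  # Each batch covers 2 * batch_size numbers, i.e. batch_size complete edges.
  e <- min(s + 2 * batch_size - 1, length(all_edges))
  g_batched <- add_edges(g_batched, all_edges[s:e])
}

# Option 2: build the whole graph from the full vector in a single call.
g_single <- make_graph(all_edges, n = n_vertices, directed = FALSE)

# Both graphs should report the same number of edges.
c(ecount(g_batched), ecount(g_single))
```

In a real run the loop in option 1 would fill the batch vector while reading 
the file instead of slicing a vector that is already in memory; the point is 
only that add_edges() is called once per batch and not once per edge.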

