[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [igraph] iGraph noob - network size and memory

From: Michael Stauffer \(v\)
Subject: Re: [igraph] iGraph noob - network size and memory
Date: Tue, 15 Jun 2010 13:29:03 -0400


Thanks again, this is very helpful.


>> Initially we'll be working on graphs with millions of nodes
>> and edges, and possibly going up to hundreds of millions. 
>The iGraph's
>> description mentions working very efficiently with millions of edges.
>> Does this extend to 10's or 100's of millions?
>Probably yes, assuming that you have enough memory. I can definitely
>generate an Erdos-Renyi random network with 10 million vertices and 200
>million edges on my computer and run some basic queries on it:
>>>> g= Graph.Erdos_Renyi(n=int(1e7), m=int(2e8))
>>>> print g.maxdegree()
>However, the graph takes up more than 8GB of memory at this stage, and
>since my machine has only 4GB of physical memory, there is heavy
>swapping involved. You'll need plenty of memory and of course a 64-bit
>OS, but otherwise you should be fine.
>> I've poked around a bit and seen some posts that suggest iGraph keeps
>> its network data structure in main memory. Is that correct?
>Yes, it is. It uses an indexed edge list data structure, which is
>separated from the rest of the code in src/type_indexededgelist.c. The
>rest of the code uses a well-defined interface to communicate with the
>internal data structure, so if you need to use a different data
>strcture, you only have to reimplement the functions in
>src/type_indexededgelist.c and recompile igraph.
>> Did something come of the suggestion to work on a branch with a
>> (presumably) disk-based database for the graph data?
>There is a branch on Launchpad that uses a PostgreSQL database as a
>backend. It's not kept in sync with the main branch (I'm not sure if it
>works now or not), it's not part of the official codebase, but 
>it serves
>as a proof of concept:
>Feel free to contact the owner of the branch on Launchpad if you have
>questions. The only catch is that there are some igraph functions which
>explicitly construct an adjacency list representation of the graph in
>memory for sake of efficiency, and this is not part of the basic graph
>interface yet, so for these cases, you will end up with the whole graph
>being loaded in memory anyway. I think the basic interface could be
>modified so that igraph algorithms could send "hints" to the internal
>data structure saying things like "hey, I'm gonna use lots of
>neighbors() queries in the near future", and the internal data 
>could adapt (a database-based backend could ignore these hints while
>an in-memory backend could construct the adjacency list). This requires
>some modifications to the igraph core, but well, if you want to use a
>database, you will need some modifications anyway.

reply via email to

[Prev in Thread] Current Thread [Next in Thread]