Re: [igraph] efficiency
From: Tamas Nepusz
Subject: Re: [igraph] efficiency
Date: Mon, 30 Nov 2009 14:19:14 +0000
> Yeah, I first put all edge into a list (link information is stored in a .txt
> file, read the file to add edges into a list), then call add_edges() to add
> edges in a batch.
> As I said before, it is slow because list operation in python is very slow
> when the list becomes large.
Hmmmm... I believe that Python's list.append() method should have an amortized
cost of O(1), which seems to be confirmed by the timeit module:
import timeit

def measure_append_performance(n):
    # Time n calls to list.append on an initially empty list.
    return timeit.Timer('l.append((1,2))', 'l = []').timeit(n)
10000 calls to append (to build a list of 10000 identical elements) take 2 ms,
10^5 calls take 14 ms, 10^6 calls take 129 ms, 10^7 calls take 1291 ms, and 10^8
calls take 13862 ms, but at this point my laptop starts swapping, so I guess
this is the point where the O(1) approximation breaks down due to memory
management costs. Otherwise the cost seems pretty much linear for me. The same
also applies when I use random floating point values generated using
random.random().
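To reproduce the measurement above, a self-contained sketch (restating the helper with its import, and with placeholder sizes) that checks the roughly linear scaling:

```python
import timeit

def measure_append_performance(n):
    # Time n calls to list.append on an initially empty list.
    return timeit.Timer('l.append((1,2))', 'l = []').timeit(n)

# If append is amortized O(1), ten times as many appends should take
# roughly ten times as long (until memory pressure kicks in).
t_small = measure_append_performance(10_000)
t_large = measure_append_performance(100_000)
print(t_small, t_large)
```

Note that timeit.Timer.timeit() disables the cyclic garbage collector while measuring, so these numbers exclude any GC overhead.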
What might be happening in your case is:
- either you run out of memory earlier because of other data structures you
build while loading your graph (I assume this because otherwise you could
simply use Graph.Read_Edgelist to load your edge list directly if your input
file does not contain any other information),
- or Python's garbage collector interferes with the loading process somehow and
that's why it takes longer. (timeit.Timer turns off garbage collection while
measuring.)
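One quick way to test the garbage-collector hypothesis is to disable collection around the bulk append, just as timeit does while measuring. A minimal sketch, assuming the input is "src dst" pairs per line (the parsing format is a placeholder; adapt it to your .txt file):

```python
import gc

def load_edge_list(lines):
    # Parse "src dst" pairs into a list of tuples, with the cyclic
    # garbage collector disabled during the bulk append (mirroring
    # what timeit.Timer does while it measures).
    gc.disable()
    try:
        edges = []
        for line in lines:
            src, dst = line.split()
            edges.append((int(src), int(dst)))
    finally:
        gc.enable()
    return edges

edges = load_edge_list(["0 1", "1 2", "2 0"])
print(edges)  # [(0, 1), (1, 2), (2, 0)]
```

If loading is noticeably faster with gc disabled, the collector was the culprit; the resulting list can then be passed to add_edges() in one batch as before.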
--
Tamas