Re: [igraph] mismatched closeness values in R

From: Ju-Sung (Jay) Lee
Subject: Re: [igraph] mismatched closeness values in R
Date: Sat, 21 Dec 2013 10:45:24 +0000 (Eur)
User-agent: Alpine 2.00 (WNT 1167 2008-08-23)

On Fri, Dec 20, 2013 at 1:05 PM, Ju-Sung Lee <address@hidden> wrote:
The results of igraph R's closeness() and centralization.closeness()$res are
different.  It appears closeness() uses the sum of shortest path lengths
whereas centralization.closeness() uses the average (before the inversion).
Was this intentional?

I guess so. Centralization means maximization imho, and if you invert the average distance, then you would need to minimize. But I agree it can be confusing.

If by maximizing centralization you mean how we use centrality, then I agree: a higher value should indicate greater centrality. However, both functions accomplish this: one inverts the sum of shortest path lengths, the other inverts the average. When I wrote earlier, I left out that closeness() also performs an inversion (but on the sum of shortest path lengths). So I'm not sure I agree with your second clause: "if you invert the average distance, then you would need to minimize". Also, the outputs of the two functions differ only by a multiplicative factor of (n-1) and correlate 100%. I simply expected the outputs to be identical.

Also, the documentation is confusing. It states that the function employs the average length of the shortest paths (which is what centralization.closeness() uses), but the formula does not reflect this (closeness() itself is consistent with the formula).

It _does_ employ the average length of shortest paths. And then it
inverts it, AFAIR.

centralization.closeness() does that, but according to my calculations, closeness() inverts the *sum* of shortest path lengths. Also, in various reports of the closeness centrality formula across the 'net, people seem to report both versions of the formula. Here's how you can confirm that the outputs are off by a factor of (n-1) (i.e., the averaging factor):

[1] 1.0000000 0.5714286 0.5714286 0.5714286 0.5714286
[1] 0.2500000 0.1428571 0.1428571 0.1428571 0.1428571
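The two printed vectors can be cross-checked without igraph. The sketch below is only an illustration in plain Python, not the code that produced the output above. It assumes an undirected star graph on 5 vertices, which is simply one graph whose closeness values match the printed numbers; the function names are made up for this example. It computes both variants of the formula, the inverse of the sum and the inverse of the average of shortest path lengths, using breadth-first search.

```python
from collections import deque

def shortest_path_lengths(adj, source):
    """Breadth-first search distances from `source` in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def closeness_by_sum(adj, v):
    """1 / sum(d(v, i), i != v): the formula as written in the help text."""
    dist = shortest_path_lengths(adj, v)
    return 1.0 / sum(d for u, d in dist.items() if u != v)

def closeness_by_average(adj, v):
    """(n - 1) / sum(d(v, i), i != v): inverse of the average distance."""
    dist = shortest_path_lengths(adj, v)
    total = sum(d for u, d in dist.items() if u != v)
    return (len(adj) - 1) / total

# Assumed test graph: undirected star, vertex 0 is the hub, 1..4 are leaves.
n = 5
star = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}

by_average = [closeness_by_average(star, v) for v in range(n)]
by_sum = [closeness_by_sum(star, v) for v in range(n)]

print([round(x, 7) for x in by_average])
print([round(x, 7) for x in by_sum])

# The two variants differ by exactly the averaging factor (n - 1).
ratios = [a / s for a, s in zip(by_average, by_sum)]
print(ratios)
```

On this assumed graph the average-based values agree with the first printed line (1, then 4/7 for each leaf) and the sum-based values with the second (1/4, then 1/7 for each leaf), and every ratio comes out as n-1 = 4 up to floating-point rounding.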

Also, when I said 'formula does not reflect this', I meant the formula as written in the help documentation does not reflect the averaging. From the igraph documentation:

"The closeness centrality of a vertex is defined by the inverse of the average length of the shortest paths to/from all the other vertices in the graph:

1/sum( d(v,i), i != v)"

If the formula were to reflect the text above it, it should be written as:
"(n-1)/sum (d(v,i), i != v)"

This was found under igraph_0.6.5-2 (I haven't upgraded my R yet so I don't know if this was fixed in igraph_0.6.6 but the README doesn't report any of this).

Which README is this?

Sorry, I meant the NEWS file in the igraph_0.6.6 source package. I also just managed to load igraph_0.6.6 and the closeness output disparity is still there. FYI, for some reason update.packages('igraph') didn't do anything, so I had to exit my R session and re-install the package. I don't know whether this is an issue with igraph, R, or both, but other people seem to have had this problem (with other packages).

