Re: [igraph] mismatched closeness values in R

Ju-Sung (Jay) Lee

Sat, 21 Dec 2013 10:45:24 +0000 (Eur)

On Fri, Dec 20, 2013 at 1:05 PM, Ju-Sung Lee <address@hidden> wrote:

The results of closeness() and centralization.closeness()$res in igraph for R
are different. It appears closeness() uses the sum of shortest path lengths,
whereas centralization.closeness() uses the average (before the inversion).
Was this intentional?

I guess so. Centralization means maximization imho, and if you invert
the average distance, then you would need to minimize. But I agree it can
be confusing.

If by maximizing centralization you mean how we use centrality, then I
agree: a higher value should indicate more centrality. However, both
functions accomplish this: one inverts the sum of shortest path lengths, the
other inverts the average. When I wrote earlier, I left out that closeness()
also performs an inversion (but on the sum of shortest paths). So, I'm not
sure I agree with your second clause: "if you invert the average distance,
then you would need to minimize". Also, the outputs of the two functions
differ only by a multiplicative factor of (n-1) and are perfectly
correlated. I had simply expected the outputs to be identical.

Also, the documentation is confusing. It states the function employs
the average length of shortest paths (which is what
centralization.closeness() uses) but the formula does not reflect this
(but closeness() is consistent with the formula).

It _does_ employ the average length of shortest paths. And then it
inverts it, AFAIR.

centralization.closeness() does that, but according to my
calculations, closeness() employs the inversion of the *sum* of shortest
path lengths. Also, in various write-ups of the closeness centrality formula
across the 'net, people seem to report both versions of the formula.
Here's how you can confirm that the outputs are off by a factor of (n-1)
(i.e., the averaging factor):

centralization.closeness(graph.star(5,mode="undirected"))$res

[1] 1.0000000 0.5714286 0.5714286 0.5714286 0.5714286

closeness(graph.star(5,mode="undirected"))

[1] 0.2500000 0.1428571 0.1428571 0.1428571 0.1428571
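
As a sanity check that does not depend on the igraph version, the two definitions can be sketched in base R. This assumes the distance matrix of the 5-vertex undirected star is built by hand; the names d, sum_based and avg_based are only for illustration and are not igraph's. It should reproduce the two vectors above.

```r
# Distance matrix of an undirected star with 5 vertices (vertex 1 is the hub).
# Built by hand for illustration; this is not igraph's implementation.
n <- 5
d <- matrix(2, nrow = n, ncol = n)  # leaf-to-leaf distance is 2
d[1, ] <- 1                         # hub to every leaf is 1
d[, 1] <- 1                         # every leaf to hub is 1
diag(d) <- 0                        # distance to self is 0

total <- rowSums(d)                 # sum of shortest path lengths per vertex

sum_based <- 1 / total              # inverse of the sum
avg_based <- (n - 1) / total        # inverse of the average

print(sum_based)
print(avg_based)
print(avg_based / sum_based)        # the ratio is n - 1 for every vertex
```

The ratio in the last line is the same for every vertex, which is why the two outputs are perfectly correlated.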

Also, when I said 'formula does not reflect this', I meant the formula as
written in the help documentation does not reflect the averaging. From the
igraph documentation:

"The closeness centrality of a vertex is defined by the inverse of the
average length of the shortest paths to/from all the other vertices in the
graph:

1/sum( d(v,i), i != v)"

If the formula were to reflect the text above it, it should be written as:

"(n-1)/sum( d(v,i), i != v)"

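Written out, and assuming d(v,i) is the shortest-path distance between v and i and n is the number of vertices, the two versions differ only by the constant n-1. The labels C_sum and C_avg are mine, not igraph's.

```latex
C_{\mathrm{sum}}(v) = \frac{1}{\sum_{i \neq v} d(v,i)}, \qquad
C_{\mathrm{avg}}(v) = \frac{1}{\frac{1}{n-1}\sum_{i \neq v} d(v,i)}
                    = \frac{n-1}{\sum_{i \neq v} d(v,i)}
                    = (n-1)\, C_{\mathrm{sum}}(v)
```

For the 5-vertex star, n-1 = 4, which matches 0.25 vs 1 for the hub and 0.1428571 vs 0.5714286 for the leaves.
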
This was found under igraph_0.6.5-2 (I haven't upgraded my R yet so I
don't know if this was fixed in igraph_0.6.6 but the README doesn't
report any of this).

Which README is this?

Sorry, I meant the NEWS file in the igraph_0.6.6 source package. I also
just managed to load igraph_0.6.6 and the closeness output disparity is
still there. FYI, for some reason update.packages('igraph') didn't do
anything, so I had to exit my R session and re-install the package; I don't
know whether this is an issue with igraph, R, or both, but other people
seem to have had this problem (with other packages).

Jay