Friday, October 18, 2013

Total subgraph communicability in GraphLab

Our user Jacob Kesinger has implemented total subgraph communicability in GraphLab v2.2.
By the way, Jacob is looking for a data scientist position in the area between San Jose and Palo Alto. So if you are looking for PhD in math, with a lot of applied knowledge, I truly recommend Jacob!
(And tell him I sent you.. :-) Anyway he is his description of the method:

I'm posting to the list to announce that I've implemented Total
Subgraph Communicability, a new centrality measure due to
Benzi&Klymco[0].   For a directed graph with adjacenty matrix A, 

TSC_i = sum_j exp(A)_{ij} = (exp(A)*1)_i.

This code calculates the TSC using an Arnoldi iteration on the Krylov
subspace {b, Ab,A*Ab, A*A*Ab, ...}  due to Saad[1], and using the new
warp engine from Graphlab 2.2 (without which this would have been, at
best, very challenging).

I don't have access to a cluster right now, so I can't test that it performs
properly in the distributed environment, but it works fine on a single
macbook air (and can compute TSC for a graph with 130K nodes and 1.4M
edges in less than a minute under load).  

Small components of large graphs will have bogus answers due to
floating point issues.  To find the exact TSC for a particular node i,
run with "--column i" to find exp(A)*e_i; you will have to sum the
resulting output yourself, however.

0 1
1 2
1 3
2 4
3 4
1 0
2 1
3 1
4 2
4 3
0 5.17784
1 10.3319
2 8.49789
3 8.49789
4 7.96807

You can verify this in python as:
import scipy
import scipy.linalg
A = scipy.array([[0,1,0,0,0],[1,0,1,1,0],[0,1,0,0,1],[0,1,0,0,1],[0,0,1,1,0]])

[0]: Benzi, Michele, and Christine Klymko. Total Communicability as a Centrality Measure. ArXiv e-print, February 27, 2013.

[1]: Saad, Yousef. “Analysis of Some Krylov Subspace Approximations to the Matrix Exponential Operator.” SIAM Journal on Numerical Analysis 29, no. 1 (1992): 209–228.

Thanks Jacob for your great contribution!!

No comments:

Post a Comment