I have now implemented the Lanczos algorithm in GraphLab, as part of
GraphLab's collaborative filtering library.
Here are some performance results using the Netflix data (100M non-zeros):
On a 16-core machine (2.6 GHz Quad-Core AMD Opteron x 2), we get a speedup of about 9. Each iteration over the Netflix data takes 10 seconds using 16 cores (an iteration consists of multiplying by the matrix A and then multiplying by A^T).
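To make the "multiply by A, then by A^T" step concrete, here is a minimal single-threaded sketch of a Lanczos iteration on A^T A in plain Python. It is only an illustration, not GraphLab's actual code: the function names (`matvec`, `rmatvec`, `lanczos`) and the (row, col, value) triple storage are assumptions made for this example, and it does no re-orthogonalization or parallelization.

```python
import math
import random

def matvec(triples, x, n_rows):
    """Compute y = A x, with A stored as (row, col, value) triples."""
    y = [0.0] * n_rows
    for i, j, v in triples:
        y[i] += v * x[j]
    return y

def rmatvec(triples, y, n_cols):
    """Compute z = A^T y using the same triple storage."""
    z = [0.0] * n_cols
    for i, j, v in triples:
        z[j] += v * y[i]
    return z

def lanczos(triples, n_rows, n_cols, steps, seed=0):
    """Run up to `steps` Lanczos iterations on A^T A.

    Returns the diagonal (alphas) and off-diagonal (betas) entries of the
    tridiagonal matrix T. Hypothetical helper, for illustration only.
    """
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(n_cols)]
    norm = math.sqrt(sum(c * c for c in v))
    v = [c / norm for c in v]          # normalized random start vector
    v_prev = [0.0] * n_cols
    beta = 0.0
    alphas, betas = [], []
    for _ in range(steps):
        # One iteration: multiply by A, then by A^T.
        w = rmatvec(triples, matvec(triples, v, n_rows), n_cols)
        alpha = sum(wi * vi for wi, vi in zip(w, v))
        # Orthogonalize against the two most recent Lanczos vectors.
        w = [wi - alpha * vi - beta * pi for wi, vi, pi in zip(w, v, v_prev)]
        beta = math.sqrt(sum(c * c for c in w))
        alphas.append(alpha)
        if beta < 1e-12:               # invariant subspace found, stop early
            break
        betas.append(beta)
        v_prev, v = v, [c / beta for c in w]
    return alphas, betas

# Tiny example: A = diag(3, 1), so A^T A = diag(9, 1).
alphas, betas = lanczos([(0, 0, 3.0), (1, 1, 1.0)], n_rows=2, n_cols=2, steps=2)
```

The eigenvalues of the small tridiagonal matrix built from `alphas` and `betas` approximate the eigenvalues of A^T A, and their square roots approximate the singular values of A. The two sparse passes over the non-zeros per iteration are the natural part to parallelize.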
Using the KDD Cup data (track1), we have 252M non-zeros, and each iteration takes 31.4 seconds on the same 16-core machine.
To give a concrete performance comparison, I have also run svds() in Matlab on the Netflix data. Extracting 2 eigenvalues takes 126.2 seconds in Matlab, while in GraphLab the same computation (using 16 cores) takes 21.9 seconds. So we are about 6 times faster than Matlab (126.2 / 21.9 ≈ 5.8). (Note that the two implementations may not use exactly the same algorithm, so this comparison should be taken with a grain of salt.)
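For reference, the ratios behind these numbers can be reproduced with a few lines of Python. The variable names are just for illustration, and the parallel efficiency figure assumes the speedup of 9 is measured relative to a single core, which is not stated explicitly above.

```python
# Timings and sizes quoted above.
matlab_seconds = 126.2
graphlab_seconds = 21.9
speedup_16_cores = 9.0       # assumed to be relative to a single core
cores = 16

netflix_nonzeros, netflix_iter_seconds = 100e6, 10.0
kdd_nonzeros, kdd_iter_seconds = 252e6, 31.4

# GraphLab vs. Matlab svds() on Netflix.
matlab_ratio = matlab_seconds / graphlab_seconds

# Fraction of ideal linear scaling achieved on 16 cores.
parallel_efficiency = speedup_16_cores / cores

# Non-zeros processed per second of iteration time on each data set.
netflix_throughput = netflix_nonzeros / netflix_iter_seconds
kdd_throughput = kdd_nonzeros / kdd_iter_seconds

print(f"Matlab / GraphLab time ratio: {matlab_ratio:.2f}")
print(f"Parallel efficiency: {parallel_efficiency:.1%}")
print(f"Netflix: {netflix_throughput / 1e6:.1f}M non-zeros/s per iteration")
print(f"KDD Cup: {kdd_throughput / 1e6:.1f}M non-zeros/s per iteration")
```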