Wednesday, April 27, 2011

Yahoo! KDD Cup using GraphLab - Track 2?

I was delighted to hear from Suhrid Balakrishnan of AT&T Labs that he is using GraphLab pmf for factorizing a linear model for Yahoo! KDD Cup track 2. Initially I focused only on track 1, but it seems that GraphLab is potentially useful for track 2 as well.

Overall, this month I am aware of 20 installations of the GraphLab pmf code in various research groups all over the world. Specifically, I got feedback from University of Austin, University of the Aegean, University of Macedonia, Carnegie Mellon University and AT&T Labs.

Suhrid shared several valuable observations from his experience with GraphLab. I want to pass them along and ask whether anyone else has had the same experience.

1) I recommend downloading the latest version from the Mercurial repository, since I am constantly improving the matrix factorization code and adapting it to the KDD dataset. See the explanation in my previous post:
http://bickson.blogspot.com/2011/04/yahoo-kdd-cup-using-graphlab.html

2) It is better to install itpp first, since GraphLab auto-detects it, which saves a lot of trouble later.

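3) Suhrid hit an interesting error when saving the factorized matrices U and V. It seems that on his Ubuntu system the ordering of the factor entries was somehow reversed.
The following MATLAB code solved this issue:
Ud = reshape(U(:), size(U,2), size(U,1));  % reinterpret the same data with the dimensions swapped
Ud = Ud';                                  % transpose back to the original shape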
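
One plausible explanation for why this reshape-and-transpose fix works is a row-major versus column-major mismatch. This is an assumption and has not been confirmed for Suhrid's setup. The idea is that the file holds the matrix flattened row by row, while MATLAB fills matrices column by column. The small pure-Python sketch below reproduces that situation on a toy 2x3 matrix and applies the same two steps as the MATLAB code. All function names here are hypothetical helpers written for this illustration; none of them are part of GraphLab.

```python
# Assumption: the file stores a rows x cols matrix flattened row by row,
# but the reader fills a rows x cols matrix column by column (as MATLAB does).

def read_column_major(flat, rows, cols):
    """Fill a rows x cols matrix column by column, like MATLAB's reshape."""
    return [[flat[c * rows + r] for c in range(cols)] for r in range(rows)]

def flatten_column_major(m):
    """Flatten a matrix column by column, like MATLAB's U(:)."""
    return [m[r][c] for c in range(len(m[0])) for r in range(len(m))]

def transpose(m):
    """Swap rows and columns, like MATLAB's U'."""
    return [list(col) for col in zip(*m)]

def fix_ordering(u):
    """Analogue of: Ud = reshape(U(:), size(U,2), size(U,1)); Ud = Ud';"""
    rows, cols = len(u), len(u[0])
    ud = read_column_major(flatten_column_major(u), cols, rows)
    return transpose(ud)

# Toy data: the matrix we meant to save, flattened row by row.
true_matrix = [[1, 2, 3], [4, 5, 6]]
flat_row_major = [x for row in true_matrix for x in row]

# Reading that flat data column by column scrambles the entries.
scrambled = read_column_major(flat_row_major, 2, 3)

# The reshape-and-transpose fix recovers the intended matrix.
restored = fix_ordering(scrambled)
```

In this toy case `scrambled` has the right shape but the wrong entries, and `restored` equals `true_matrix`. If the real problem has a different cause, the MATLAB fix may still work, but this sketch would not explain it.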

I have opened a Google group for discussions and questions concerning KDD usage of GraphLab. Everyone is welcome to join:
* Group name: GraphLab KDD
* Group home page: http://groups.google.com/group/graphlab-kdd
* Group email address: graphlab-kdd@googlegroups.com
