Large Scale Machine Learning and Other Animals: Online SVD/PCA resources

Saturday, March 24, 2012

Online SVD/PCA resources

Last month I was vising Toyota Technological Institure in Chicago, where I was generously hosted by Tamir Hazan and Joseph Keshet. I heard some interesting stuff about large scale SVM from Joseph Keseht which I reported here Additionally I met with Raman Arora who is working on online SVD. I asked Raman to summarize the state-of-the-art research on online SVD and here is what I got from him:

Online PCA - the only work (that I am aware of) which comes with a guarantee is the following paper:

Warmuth, Manfred K. and Kuzmin, Dima. Randomized PCA algorithms with regret bounds that are logarithmic in the dimension. In Advances in Neural Information Processing Systems (NIPS), 2006.

Other references on which we build our work include following classical papers:

E. Oja and J. Karhunen. On stochastic approximation of the eigenvectors and eigenvalues of the expectation of a random matrix. Journal of Mathematical Analysis and Applications. Volume 106. Pages 69-84. 1985.

Terence D. Sanger. Optimal unsupervised learning in a single-layer linear feedforward neural network. Neural Networks. Volume 12. Pages 459473. 1989

and these more recent papers:

Nicol N. Schraudolph, Simon Günter and S. V. N. Vishwanathan. Fast iterative kernel PCA. Advances in Neural Information Processing Systems. 2007.

Kwang In Kim, Matthias O. Franz, and Bernhard Schölkopf. Iterative Kernel Principal Component
Analysis for Image Modeling. IEEE transactions on pattern analysis and machine intelligence. Volume 27, number 9. Pages 1351-1366. 2005.

Dan Yang, Zongming Ma and Andreas Buja, A sparse SVD method for high-dimensional data, 2011.

Whitten, Tibshirani and Hastie, A penalized matrix decomposition, with applications to sparse principal components and canonical correlation analysis, 2009.

Lee, Shen, Huang and Marron, Biclustering via sparse singular value decomposition, 2010.

There is another recent paper by Dan Yang on near optimality of sparse SVD algorithm that she proposed. She described some of her results during a talk she gave at U-Chicago but I couldn't find a copy of her paper online.

I am looking forward to read Raman's paper on online SVD once it is ready.

1 comment:

jovoMarch 31, 2012 at 11:52 PM
the paper entitled "Reliable Eigenspectra for New Generation Surveys" by Budavári, Tamás; Wild, Vivienne; Szalay, Alexander S.; Dobos, László; Yip, Ching-Wa is also pretty awesome imho.
ReplyDelete
Replies

Add comment

Large Scale Machine Learning and Other Animals

Saturday, March 24, 2012

Online SVD/PCA resources

1 comment:

Labels

GraphLab Users Google Group

pagerank

google analytics

syntax