Wednesday, March 7, 2012

Large scale SVM (support vector machine)

Not long ago I had the pleasure of visiting the Toyota Technological Institute at Chicago.
I had an interesting meeting with Joseph Keshet. We discussed the best way to
deploy large-scale kernel SVMs.

According to Joseph, linear SVM is pretty much a solved problem. State-of-the-art solutions
include Pegasos and SVMLight (see Joachims 2006).

Recently, Joseph has worked on large-scale SVMs on two fronts: GPU and MPI.
The GPU implementation of the kernelized (not linear) SVM can be found here. The paper appeared in KDD 2011.
Here is an image depicting the nice speedup they got:

The second SVM activity by Joseph is quickly approximating the kernel matrix using a Taylor series expansion. This work is presented in their arXiv paper. Evaluating the kernel matrix is a major overhead in SVM implementations, especially because the matrix is dense. Previously, I implemented a large-scale SVM solver that ran on up to 1024 cores of an IBM BlueGene supercomputer. About 90% of the time was spent evaluating the kernel matrix. I wish I had had some fancy techniques like the ones Joseph proposes for quickly approximating the kernel matrix...
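
To give a feel for the idea, here is a minimal sketch of a Taylor approximation of a kernel value. It is my own illustration, not the code or the exact method from Joseph's paper. I am assuming a Gaussian kernel (the post above does not say which kernel is used), and the function names, the `sigma` bandwidth and the `degree` truncation parameter are all made up for the example. The trick is that exp(-||x-y||^2 / (2 sigma^2)) factors into two terms that depend on x and y separately, times exp(<x,y> / sigma^2), and only that last factor needs to be expanded.

```python
import math


def gaussian_kernel(x, y, sigma=1.0):
    """Exact Gaussian kernel: exp(-||x - y||^2 / (2 sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def taylor_gaussian_kernel(x, y, sigma=1.0, degree=5):
    """Approximate the Gaussian kernel with a truncated Taylor series.

    Hypothetical helper for illustration only. Uses the identity
    exp(-||x-y||^2 / 2s^2) = exp(-||x||^2 / 2s^2) * exp(-||y||^2 / 2s^2)
                             * exp(<x, y> / s^2)
    and keeps the terms of exp(<x, y> / s^2) up to order `degree`.
    """
    s2 = sigma ** 2
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = sum(a * a for a in x)
    norm_y = sum(b * b for b in y)

    # This part depends on x and y only through their norms.
    prefactor = math.exp(-(norm_x + norm_y) / (2.0 * s2))

    # Truncated series: sum over k = 0..degree of z^k / k!
    z = dot / s2
    series = 0.0
    term = 1.0  # z^0 / 0!
    for k in range(degree + 1):
        series += term
        term *= z / (k + 1)  # turns z^k / k! into z^(k+1) / (k+1)!
    return prefactor * series


if __name__ == "__main__":
    x = [0.5, -0.2, 0.1]
    y = [0.3, 0.4, -0.1]
    print("exact :", gaussian_kernel(x, y))
    print("approx:", taylor_gaussian_kernel(x, y, degree=5))
```

The approximation is good when <x,y> / sigma^2 is small and gets worse as it grows, so in practice the data scaling and the truncation degree matter. The appeal, as I understand it, is that each term of the series is a polynomial in <x,y>, so the truncated kernel corresponds to an explicit finite set of features and the dense kernel matrix does not have to be evaluated entry by entry.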
