Large Scale Machine Learning and Other Animals: What are the most widely deployed machine learning algorithms?

Thursday, June 9, 2011

What are the most widely deployed machine learning algorithms?

One interesting question is: "What are the most useful machine learning algorithms?"
I ran a small survey by counting keyword occurrences on the Mahout user mailing list. The results are shown in the plot above.
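
For anyone who wants to try a similar count, the keyword-counting idea could be sketched roughly as below. This is not the script used for the survey: the keyword list, the sample messages, and the function name `count_keywords` are all assumptions made for illustration, and a real run would first need the mailing list archive loaded as plain-text messages.

```python
import re
from collections import Counter

# Hypothetical keyword list; the survey's actual keywords are not given in the post.
KEYWORDS = ["svd", "k-means", "lda", "naive bayes", "random forest"]


def count_keywords(messages, keywords=KEYWORDS):
    """Count total case-insensitive occurrences of each keyword across messages."""
    counts = Counter({kw: 0 for kw in keywords})
    for kw in keywords:
        # Lookarounds keep "svd" from matching inside longer tokens such as "svdpp".
        pattern = re.compile(r"(?<!\w)" + re.escape(kw) + r"(?!\w)")
        for text in messages:
            counts[kw] += len(pattern.findall(text.lower()))
    return counts


if __name__ == "__main__":
    # Made-up messages standing in for mailing list posts.
    sample = [
        "Has anyone run SVD on a sparse matrix?",
        "K-means does not converge; is k-means sensitive to the seed?",
    ]
    for keyword, n in count_keywords(sample).most_common():
        print(keyword, n)
```

Counting raw occurrences is a crude proxy: a single long thread about one algorithm can inflate its total, so counting distinct threads or distinct posters per keyword may be a fairer variant.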

It seems that matrix factorization (SVD) is the most widely used algorithm, followed by K-means. We have just implemented SVD as part of the GraphLab Collaborative Filtering library. Anyone who wants to beta test it is welcome!

5 comments:

Dan, June 9, 2011 at 8:22 AM
Nice :) But maybe this is a plot of inverse quality of the documentation for each algorithm?
Danny Bickson, June 9, 2011 at 8:34 AM
I think you are wrong - for Mahout's SVD we would get division by zero.. :-)
Ted Dunning ... apparently Bayesian, June 9, 2011 at 11:27 PM
I think that this is definitely not a measure of usage. From my experience, the order in descending frequency would be recommendations, K-means, SGD, SVD and frequent itemsets. The first two or three dominate. The order of the later ones is uncertain.
helwyr, June 11, 2011 at 5:39 PM
nice, added to the list http://www.quora.com/What-are-the-top-10-data-mining-or-machine-learning-algorithms
Danny Bickson, June 11, 2011 at 5:44 PM
Thanks!

- Danny Bickson