Recently, Professor Pedro Domingos, one of the top machine learning researchers in the world, wrote a great article in the Communications of the ACM entitled “A Few Useful Things to Know about Machine Learning”. In it, he not only summarizes the general ideas in machine learning in fairly accessible terms, but he also manages to impart most of the things we’ve come to regard as common sense or folk wisdom in the field.
One thing you should worry about when applying machine learning to high-dimensional data is random correlation. I got a great example from my friend and collaborator Erik Aurell:
Chocolate Consumption, Cognitive Function, and Nobel Laureates
Franz H. Messerli, M.D.
Chocolate consumption could hypothetically improve cognitive function not only in individuals but in whole populations. Could there be a correlation between a country's level of chocolate consumption and its total number of Nobel laureates per capita?
This hilarious graph shows a great correlation between chocolate consumption and Nobel prizes.
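To see why random correlation becomes a real worry as the number of dimensions grows, here is a minimal simulation sketch. It is not based on the chocolate data: it assumes pure Gaussian noise, a small sample of 20 "countries", and the hypothetical helper names `pearson` and `max_spurious_correlation`. It generates a random target and then checks how strongly the best of many equally random features happens to correlate with it.

```python
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def max_spurious_correlation(n_samples, n_features, seed=0):
    """Largest |correlation| between a random target and random features.

    Everything here is independent noise, so any correlation found
    is spurious by construction.
    """
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_samples)]
    best = 0.0
    for _ in range(n_features):
        feature = [rng.gauss(0, 1) for _ in range(n_samples)]
        best = max(best, abs(pearson(feature, target)))
    return best


if __name__ == "__main__":
    # Assumed setup: 20 samples, increasing numbers of noise features.
    for n_features in (1, 10, 100, 1000):
        r = max_spurious_correlation(20, n_features)
        print(f"{n_features:5d} features: best |r| = {r:.3f}")
```

With a fixed seed, the best correlation can only go up as more features are tried, because each larger run includes all the features of the smaller one. If you look at enough unrelated variables over a small sample, some of them will line up with your target by chance alone.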