Sunday, October 19, 2014

Interesting dataset from Ancestry.com

It was released at Strata NY. It contains US family records from 1820 onwards. Only the last names are anonymized; all other details are real.

Here is a link.

Apache Flink

The Stratosphere project is a new system that competes with Spark. It was recently accepted as an Apache incubator project under the name Flink. My colleague and friend Prof. Seif Haridi from Stockholm is one of the key researchers working on this project. He is also the PhD advisor of Ali Ghodsi, one of the co-founders of Databricks.

Saturday, October 11, 2014

News from Recsys - GraphLab Create based solution wins second place

Yesterday I met Robi Palovicz at ACM Recsys. Robi is a PhD from The Hungarian Academy of Sciences, supervised by my friend and colleague Andras Benczur. Robi led the team that won 2nd place in the Recsys challenge 2014.

His solution was based on GraphLab Create and, more interestingly, it takes only 30 seconds to compute the winning solution with GraphLab Create!
Robi has promised to share their winning solution as an IPython Notebook. Once it is ready we will publish it on our website and I will post a note here.

Friday, October 10, 2014

Microsoft Research Mountain View Shuts Down - What is going on at Microsoft?

I was an intern there in 2006 under Prof. Dahlia Malkhi - a great research center with some of the leading CS scientists, like Leslie Lamport. Recently Microsoft has gone crazy and shut down the center.

I hear that Facebook and others are celebrating. Everyone is trying to hire those guys.