Saturday, August 4, 2012

Data Wrangler

One of the most impressive lectures in our GraphLab workshop was given by Jeffrey Heer from the HCI dept at Stanford. Data Wrangler is a visual tool that helps with cleaning large datasets, a time-consuming task which is often ignored when talking about machine learning algorithms. Using Data Wrangler, it is possible to visually specify how to clean the data on a small sample of it, and then automatically generate map/reduce or Python scripts that will run on the full dataset.
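To make the "specify on a sample, run on everything" idea concrete, here is a minimal Python sketch. It is not Wrangler's generated code or its API. The transform names (`drop_empty_rows`, `split_column`, `strip_whitespace`), the `PIPELINE` list, and the toy data are all hypothetical and only illustrate how a cleaning recipe worked out on a small sample can be replayed unchanged on a larger dataset.

```python
import csv
import io

# Hypothetical cleaning steps, written by hand for illustration only.
# They are NOT the scripts that Data Wrangler exports.

def drop_empty_rows(rows):
    """Remove rows in which every cell is blank."""
    return [r for r in rows if any(cell.strip() for cell in r)]

def split_column(rows, index, sep):
    """Split the column at `index` into two columns at the first `sep`."""
    out = []
    for r in rows:
        parts = r[index].split(sep, 1)
        if len(parts) == 1:
            parts.append("")  # keep the column count consistent
        out.append(r[:index] + parts + r[index + 1:])
    return out

def strip_whitespace(rows):
    """Trim leading and trailing whitespace from every cell."""
    return [[cell.strip() for cell in r] for r in rows]

# The "recipe": an ordered list of steps, decided while looking at a sample.
PIPELINE = [
    drop_empty_rows,
    lambda rows: split_column(rows, 0, ":"),
    strip_whitespace,
]

def run_pipeline(rows, pipeline=PIPELINE):
    """Apply each step in order and return the cleaned rows."""
    for step in pipeline:
        rows = step(rows)
    return rows

def read_rows(text):
    """Parse CSV text into a list of rows (lists of strings)."""
    return list(csv.reader(io.StringIO(text)))

# 1) Work out the recipe on a small sample.
SAMPLE = "state: Alabama,2004\n,\nstate: Alaska , 2005\n"
cleaned_sample = run_pipeline(read_rows(SAMPLE))

# 2) Replay the same recipe on the (here, simulated) full dataset.
FULL = SAMPLE * 1000
cleaned_full = run_pipeline(read_rows(FULL))
```

Because the recipe is just an ordered list of row-level steps, the same list could in principle be applied inside a map task over chunks of a much larger file, which is the spirit of the map/reduce export mentioned above.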

Here is a quick video preview (the full lecture will be online soon): 
Wrangler Demo Video from Stanford Visualization Group on Vimeo.

Here is a link to the full paper.
By the way, my second advisor, Prof. Joe Hellerstein from Berkeley, is also involved in this nice project.
