The first talk you should watch, in case you missed it, is Prof. Carlos Guestrin's keynote, which summarizes what's new in Dato:
An interesting talk by Prof. Mike Franklin of the Berkeley AMPLab, about what's new at the lab:
A related talk by Prof. Seif Haridi (SICS) about Flink, a system geared towards stream processing:
Another interesting talk, by Prof. Alex Smola, attracted a big audience. Alex has recently formed a startup around his parameter server work. Unfortunately, we have not yet received permission to release his video. I am working on that.
Wes McKinney, the creator of Python pandas, used our conference to announce Cloudera's new Ibis project, which is a way to parallelize Python code on top of a Hadoop cluster at scale.
Prof. Chris Re covered his DeepDive framework. He recently opened another exciting new startup around providing ML tools for a larger audience. For example, PaleoDeepDive allows mining complex information out of PDF papers (including NLP, mining tables, geographical coordinates, etc.).
Prof. Jeff Heer from Trifacta and the University of Washington presented his recent research on how to improve visualization, a joint research project with Tableau. Multiple layouts and options are explored, and a recommendation engine filters the results to present the most attractive and informative ones to the user.
Prof. Dhruv Batra from Virginia Tech described their visual question answering project, a cool project that answers free-text questions about images:
In the startup session, Stephen Merity from CommonCrawl gave an interesting talk:
One of the most bizarre applications (in a good sense!) is from compology.us, a US company that monitors trash bins using sensors and uses GraphLab deep learning to detect the level of trash and optimize the pickup routes.
The last talk I want to highlight is the audience favorite: a talk by Amanda Casari from Concur, which shows how to run GraphLab Create on top of Spark: